just added everything fuck this
@@ -11,3 +11,6 @@ build-dir = "public"
+[preprocessor.toc]
+command = "./mdbook-toc"
+renderer = ["html"]
 
 [output.html]
 mathjax-support = true
BIN src/Pictures/aa.png (new file, 187 KiB)
BIN src/Pictures/adg_mask.png (new file, 128 KiB)
BIN src/Pictures/banding_after3.png (new file, 47 KiB)
BIN src/Pictures/banding_before3.png (new file, 33 KiB)
BIN src/Pictures/banding_graining.png (new file, 119 KiB)
BIN src/Pictures/bbmod_fix.png (new file, 49 KiB)
BIN src/Pictures/bbmod_src.png (new file, 48 KiB)
BIN src/Pictures/bicubic.png (new file, 301 KiB)
BIN src/Pictures/bicubic_params.png (new file, 99 KiB)
BIN src/Pictures/bilinear_after2.png (new file, 339 KiB)
BIN src/Pictures/bilinear_before2.png (new file, 335 KiB)
BIN src/Pictures/cfx_bbm.png (new file, 90 KiB)
BIN src/Pictures/continuityfixer.png (new file, 51 KiB)
BIN src/Pictures/debandmask.png (new file, 42 KiB)
BIN src/Pictures/debandmask_comparison.png (new file, 82 KiB)
BIN src/Pictures/dering.png (new file, 43 KiB)
BIN src/Pictures/detail_mask.png (new file, 199 KiB)
BIN src/Pictures/detint.png (new file, 137 KiB)
BIN src/Pictures/dirt.png (new file, 7.7 KiB)
BIN src/Pictures/dither_comparison_8.png (new file, 236 KiB)
BIN src/Pictures/edgemasks_fastsobel2.png (new file, 83 KiB)
BIN src/Pictures/edgemasks_kirsch.png (new file, 160 KiB)
BIN src/Pictures/edgemasks_kirsch2.png (new file, 110 KiB)
BIN src/Pictures/edgemasks_prewitt.png (new file, 50 KiB)
BIN src/Pictures/edgemasks_retinex.png (new file, 143 KiB)
BIN src/Pictures/edgemasks_retinex2.png (new file, 311 KiB)
BIN src/Pictures/edgemasks_sobel.png (new file, 61 KiB)
BIN src/Pictures/edgemasks_src.png (new file, 250 KiB)
BIN src/Pictures/edgemasks_src2.png (new file, 186 KiB)
BIN src/Pictures/edgemasks_tcanny.png (new file, 8.1 KiB)
BIN src/Pictures/expr_limit.png (new file, 353 KiB)
BIN src/Pictures/fb_luma.png (new file, 23 KiB)
BIN src/Pictures/fb_lumachroma.png (new file, 26 KiB)
BIN src/Pictures/fb_src.png (new file, 20 KiB)
BIN src/Pictures/fox.png (new file, 66 KiB)
BIN src/Pictures/gamma.png (new file, 215 KiB)
BIN src/Pictures/geek.png (new file, 79 KiB)
BIN src/Pictures/gradfun3.png (new file, 150 KiB)
BIN src/Pictures/gradfun3_mask.png (new file, 101 KiB)
BIN src/Pictures/hbd_example.png (new file, 253 KiB)
BIN src/Pictures/improper_borders.png (new file, 9 KiB)
BIN src/Pictures/improper_improper_borders.png (new file, 1.9 KiB)
BIN src/Pictures/lion.png (new file, 1.5 MiB)
BIN src/Pictures/luma_mask.png (new file, 200 KiB)
BIN src/Pictures/masked_deband.png (new file, 123 KiB)
BIN src/Pictures/masked_debanding.png (new file, 110 KiB)
BIN src/Pictures/matrix_burning.png (new file, 197 KiB)
BIN src/Pictures/matrix_conversion.png (new file, 562 KiB)
BIN src/Pictures/rektlvls_fix.png (new file, 11 KiB)
BIN src/Pictures/rektlvls_src.png (new file, 10 KiB)
BIN src/Pictures/resize_downup.png (new file, 423 KiB)
BIN src/Pictures/resize_downup2.png (new file, 442 KiB)
BIN src/Pictures/resizers_down.png (new file, 376 KiB)
BIN src/Pictures/resizers_up.png (new file, 169 KiB)
BIN src/Pictures/retinex.png (new file, 1.1 MiB)
BIN src/Pictures/retinex_binarize_maximum_inflate.png (new file, 83 KiB)
BIN src/Pictures/retinex_edgemask.png (new file, 150 KiB)
BIN src/Pictures/sobel.png (new file, 16 KiB)
BIN src/Pictures/sobel_manipulated.png (new file, 48 KiB)
BIN src/Pictures/yuv420vsyuv444.png (new file, 140 KiB)
@@ -5,19 +5,19 @@
 - [Loading the Video]()
 - [Cropping]()
 - [Resizing](filtering/resizing.md)
-- [Descaling]()
+- [Descaling](filtering/descaling.md)
 - [Bit Depths and Dither Algorithms]()
-- [Debanding]()
-- [Dirty Lines and Border Issues]()
-- [Anti-Aliasing]()
-- [Deringing]()
-- [Dehaloing]()
-- [Denoising]()
-- [Graining]()
-- [Deblocking]()
-- [Detinting and Level Adjustment]()
-- [Dehardsubbing and Delogoing]()
-- [Masking]()
+- [Debanding](filtering/debanding.md)
+- [Dirty Lines and Border Issues](filtering/dirty_lines.md)
+- [Anti-Aliasing](filtering/anti-aliasing.md)
+- [Deringing](filtering/deringing.md)
+- [Dehaloing](filtering/dehaloing.md)
+- [Denoising](filtering/denoising.md)
+- [Graining](filtering/graining.md)
+- [Deblocking](filtering/deblocking.md)
+- [Detinting and Level Adjustment](filtering/detinting.md)
+- [Dehardsubbing and Delogoing](filtering/dehardsubbing.md)
+- [Masking](filtering/masking.md)
 - [Encoding]()
 - [Test Encodes]()
 - [x264 Settings]()
src/filtering/anti-aliasing.md (new file, +222 lines)
@@ -0,0 +1,222 @@
This is likely the most commonly known issue. If you want to fix this,
first make sure the issue stems from actual aliasing, not poor
upscaling. If you've done this, the tool I'd recommend is the `TAAmbk`
suite[^33]:

```py
import vsTAAmbk as taa

aa = taa.TAAmbk(clip, aatype=1, aatypeu=None, aatypev=None, preaa=0, strength=0.0, cycle=0,
                mtype=None, mclip=None, mthr=None, mthr2=None, mlthresh=None,
                mpand=(1, 0), txtmask=0, txtfade=0, thin=0, dark=0.0, sharp=0, aarepair=0,
                postaa=None, src=None, stabilize=0, down8=True, showmask=0, opencl=False,
                opencl_device=0, **args)
```

The GitHub README is quite extensive, but some additional comments are
necessary:
- `aatype`: (Default: 1)\
    The value here can either be a number to indicate the AA type for
    the luma plane, or it can be a string to do the same:

    ```py
    0: lambda clip, *args, **kwargs: type('', (), {'out': lambda: clip}),
    1: AAEedi2,
    2: AAEedi3,
    3: AANnedi3,
    4: AANnedi3UpscaleSangNom,
    5: AASpline64NRSangNom,
    6: AASpline64SangNom,
    -1: AAEedi2SangNom,
    -2: AAEedi3SangNom,
    -3: AANnedi3SangNom,
    'Eedi2': AAEedi2,
    'Eedi3': AAEedi3,
    'Nnedi3': AANnedi3,
    'Nnedi3UpscaleSangNom': AANnedi3UpscaleSangNom,
    'Spline64NrSangNom': AASpline64NRSangNom,
    'Spline64SangNom': AASpline64SangNom,
    'Eedi2SangNom': AAEedi2SangNom,
    'Eedi3SangNom': AAEedi3SangNom,
    'Nnedi3SangNom': AANnedi3SangNom,
    'PointSangNom': AAPointSangNom,
    ```

    The ones I would suggest are `Eedi3`, `Nnedi3`, `Spline64SangNom`,
    and `Nnedi3SangNom`. Both of the `SangNom` modes are incredibly
    destructive and should only be used if absolutely necessary.
    `Nnedi3` is usually your best option; it's not very strong or
    destructive, but often good enough, and is fairly fast. `Eedi3` is
    unbelievably slow, but stronger than `Nnedi3` and not as destructive
    as the `SangNom` modes.
- `aatypeu`: (Default: same as `aatype`)\
    Select the main AA kernel for the U plane when the clip's format is
    YUV.

- `aatypev`: (Default: same as `aatype`)\
    Select the main AA kernel for the V plane when the clip's format is
    YUV.

- `strength`: (Default: 0)\
    The strength of the predown. Valid range is $[0, 0.5]$. Before
    applying the main AA kernel, the clip will be downscaled to
    $(1-\texttt{strength})\times$ the clip's resolution, then upscaled
    back to the original resolution by the main AA kernel. This can be
    beneficial for clips with terrible aliasing commonly caused by poor
    upscaling. It is automatically disabled when using an AA kernel
    that is not suitable for upscaling. If possible, do not raise this,
    and *never* lower it.

- `preaa`: (Default: 0)\
    Select the preaa mode.

    - 0: No preaa

    - 1: Vertical

    - 2: Horizontal

    - -1: Both

    Performs a `preaa` before applying the main AA kernel. `Preaa` is
    basically a simplified version of `daa`. Pretty useful for dealing
    with residual combing caused by poor deinterlacing. Otherwise,
    don't use it.

- `cycle`: (Default: 0)\
    Sets the number of times the main AA kernel is looped. Use for
    very, very terrible aliasing and 3D aliasing.
- `mtype`: (Default: 1)\
    Select the type of edge mask to be used. Currently there are three
    mask types:

    - 0: No mask

    - 1: `Canny` mask

    - 2: `Sobel` mask

    - 3: `Prewitt` mask

    The mask is always built at 8-bit scale. All of these options are
    fine, but you may want to test them and see what ends up looking
    the best.

- `mclip`: (Default: None)\
    Use your own mask clip instead of building one. If `mclip` is set,
    the script won't build another one, and you'll have to take care of
    the mask's resolution, bit depth, format, etc. yourself.

- `mthr`:\
    Size of the mask. The smaller the value you give, the bigger the
    mask you will get.

- `mlthresh`: (Default: None)\
    Set the luma threshold for the n-pass mask. Use a list or tuple to
    specify the sections of luma.

- `mpand`: (Default: (1, 0))\
    Use a list or tuple to specify the number of mask expanding and
    mask inpanding loops.

- `txtmask`: (Default: 0)\
    Create a mask to protect white captions on screen. The value is the
    luma threshold. Valid range is $[0, 255]$. An area whose luma is
    greater than the threshold and whose chroma is $128\pm2$ will be
    considered a caption.

- `txtfade`: (Default: 0)\
    Set the length of the fade. Useful for fading text.

- `thin`: (Default: 0)\
    Warp the line with `aWarpSharp2` before applying the main AA
    kernel.

- `dark`: (Default: 0.0)\
    Darken the line with `Toon` before applying the main AA kernel.

- `sharp`: (Default: 0)\
    Sharpen the clip after applying the main AA kernel.

    - 0: No sharpening

    - 1 to inf: `LSFmod(defaults='old')`

    - 0 to 1: Similar to AviSynth's `sharpen()`

    - -1 to 0: `LSFmod(defaults='fast')`

    - -1: Contra-sharpening

    Whatever the type of sharpening, a larger absolute value of `sharp`
    means a stronger sharpen.

- `aarepair`: (Default: 0)\
    Use repair to remove artifacts introduced by the main AA kernel.
    Depending on the repair mode, each pixel in the src clip will be
    replaced by the median or average of its 3x3 neighborhood in the
    processed clip. It's highly recommended to use repair when the main
    AA kernel contains SangNom. For more information, check
    <http://www.vapoursynth.com/doc/plugins/rgvs.html#rgvs.Repair>. It
    can be hard to get it to work properly.

- `postaa`: (Default: False)\
    Whether to use soothe to counter the aliasing introduced by
    sharpening.

- `src`: (Default: clip)\
    Use your own `src` clip for sharpening, repair, mask merging, etc.

- `stabilize`: (Default: 0)\
    Stabilize temporal changes with `MVTools`. The value is the
    temporal radius. Valid range is $[0, 3]$.

- `down8`: (Default: True)\
    If you set this to `True`, the clip will be dithered down to 8-bit
    before applying the main AA kernel and brought back up to the
    original bit depth afterwards. `LimitFilter` will be used to reduce
    the loss from the depth conversion.

- `showmask`: (Default: 0)\
    Output the mask instead of the processed clip if set to a non-zero
    value. 0: Normal output; 1: Mask only; 2: Stack mask and clip; 3:
    Interleave mask and clip; -1: Text mask only.

- `opencl`: (Default: False)\
    Whether to use the OpenCL version of some plugins. Currently there
    are three plugins with OpenCL versions:

    - TCannyCL

    - EEDI3CL

    - NNEDI3CL

    This may speed up the process, which is obviously great, since
    anti-aliasing is usually very slow.

- `opencl_device`: (Default: 0)\
    Select an OpenCL device. To find out which one's the correct one,
    do

    ```py
    core.nnedi3cl.NNEDI3CL(clip, 1, list_device=True).set_output()
    ```

- `other parameters`:\
    Will be collected into a dict for the particular aatype.
In most cases, `aatype=3` works well, and it's the method with the least
detail loss. Alternatively, `aatype=2` can also provide decent results.
What these two do is upscale the clip with `nnedi3` or the blurrier (and
slower) `eedi3` and downscale back, whereby the interpolation done often
fixes aliasing artifacts.

Note that there are a ton more very good anti-aliasing methods, as well
as many different mask types you can use (e.g. other edge masks,
clamping one method's changes to those of another method, etc.).
However, most methods are based on very similar techniques to those
`TAA` implements.

An interesting suite for anti-aliasing is the one found in
`lvsfunc`[^34]. It provides various functions that are very useful for
fixing strong aliasing that a normal `TAA` call might not handle very
well. The documentation for it[^35] is fantastic, so this guide won't go
deeper into its functions.

If your entire video suffers from aliasing, it's not all too unlikely
that you're dealing with a cheap upscale. In this case, descale or
resize first before deciding whether you need to perform any
anti-aliasing.

Here's an example of an anti-aliasing fix from Non Non Biyori -
Vacation:

![Source with aliasing on left, filtered on right.](Pictures/aa.png){#fig:5}

In this example, the following was done:

```py
mask = kgf.retinex_edgemask(src).std.Binarize(65500).std.Maximum().std.Inflate()
aa = taa.TAAmbk(src, aatype=2, mtype=0, opencl=True)
out = core.std.MaskedMerge(src, aa, mask)
```
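To see what that final `std.MaskedMerge` step is doing, here's a plain-Python sketch of the per-pixel blend (an illustration only, not VapourSynth code; the 16-bit peak value is an assumption to match the guide's 16-bit workflow). Because the mask was binarized, each pixel either keeps the source or takes the anti-aliased result:

```python
# Illustrative sketch of MaskedMerge: blend the anti-aliased clip into the
# source per pixel, weighted by the mask. Values are 16-bit integers.

PEAK = 65535  # maximum value at 16-bit

def masked_merge(src: int, aa: int, mask: int) -> int:
    """Blend aa into src weighted by mask (0 = keep src, PEAK = take aa)."""
    return src + (aa - src) * mask // PEAK

# After Binarize(65500), the mask is either 0 or PEAK:
print(masked_merge(10000, 12000, 0))     # 10000: flat area, source kept
print(masked_merge(10000, 12000, PEAK))  # 12000: edge, AA result taken
```

This is why the edge mask matters: anti-aliasing only ever touches the pixels the mask marks as edges, and everything else passes through untouched.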
src/filtering/bit_depths.md (new file, +54 lines)
@@ -0,0 +1,54 @@
Although not necessary to work in if you're exporting in the same bit
depth as your source, working in high bit depth and dithering down at
the end of your filter chain is recommended in order to avoid rounding
errors, which can lead to artifacts such as banding (an example is in
figure [20](#fig:flvls)). Luckily, if you choose not to write your
script in high bit depth, most plugins will work in high bit depth
internally. As dithering is quite fast and higher depths do lead to
better precision, there's usually no reason not to work in higher bit
depths other than some functions written for 8-bit being slightly
slower.

If you'd like to learn more about dithering, the Wikipedia page[^12] is
quite informative. There are also a lot of research publications worth
reading. What you need to understand here is that your dither method
only matters if there's an actual difference between your source and
the filtering you perform. As dither is an alternative of sorts to
rounding to different bit depths, only values that fall between the
integers of the target bit depth will show differences. Some algorithms
might be better at certain things than others, hence it can be worth it
to go with non-standard algorithms. For example, if you want to deband
something and export it in 8-bit but are having issues with compressing
it properly, you might want to consider ordered dithering, as it's
known to perform slightly better in this case (although it doesn't look
as nice). To do this, use the following code:

```py
source_16 = fvf.Depth(src, 16)

deband = core.f3kdb.Deband(source_16, output_depth=16)

out = fvf.Depth(deband, 8, dither='ordered')
```

Again, this will only affect the actual debanded area. This isn't
really recommended most of the time, as ordered dither is rather
unsightly, but it's certainly worth considering if you're having
trouble compressing a debanded area. You should obviously be masking
and adjusting your debander's parameters, but more on that later.
In order to dither up or down, you can use the `Depth` function within
either `fvsfunc`[^13] (fvf) or `mvsfunc`[^14] (mvf). The difference
between the two is that fvf uses internal resizers, while mvf uses
internal ones whenever possible, but also supports `fmtconv`, which is
slower but has more dither (and resize) options. Both feature the
standard Filter Lite error diffusion dither type, however, so if you
just roll with the defaults, I'd recommend fvf. To illustrate the
difference between good and bad dither, some examples are included in
the appendix under figure [19](#fig:12). Do note you may have to zoom
in quite far to spot the difference. Some PDF viewers may also
incorrectly output the image.

I'd recommend going with Filter Lite (fvf's default, or
`mvf.Depth(dither=3)`, also the default) most of the time. Others, like
Ostromoukhov (`mvf.Depth(dither=7)`), void and cluster
(`fmtc.bitdepth(dither=8)`), or standard Bayer ordered
(`fvf.Depth(dither='ordered')` or `mvf.Depth(dither=0)`), can also be
useful sometimes. Filter Lite will usually be fine, though.
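To make the rounding-error point concrete, here's a toy pure-Python sketch (an illustration of the idea only, not how fvsfunc, mvsfunc, or fmtconv actually implement it): quantizing a flat 16-bit region that falls between two 8-bit codes. Plain rounding collapses the whole region to a single value, while even a simple 1D error diffusion preserves the average level by carrying the rounding error forward:

```python
# Toy illustration of dithering during bit depth reduction (16-bit -> 8-bit).

def round_to_8bit(samples):
    # Plain rounding: every value snaps to the nearest 8-bit code.
    return [round(v / 257) for v in samples]  # 65535 / 255 = 257

def error_diffusion_to_8bit(samples):
    # 1D error diffusion: push each pixel's rounding error onto the next one.
    out, err = [], 0.0
    for v in samples:
        ideal = v / 257 + err
        q = min(255, max(0, round(ideal)))
        err = ideal - q
        out.append(q)
    return out

flat = [100] * 1000  # a flat 16-bit area sitting between 8-bit codes 0 and 1

print(set(round_to_8bit(flat)))                   # {0}: the level is lost entirely
print(sum(error_diffusion_to_8bit(flat)) / 1000)  # ~0.39: average level preserved
```

In a real image, that pattern of 0s and 1s is the "noise" dither trades for banding; at normal viewing distances it averages out to the original level.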
src/filtering/debanding.md (new file, +155 lines)
@@ -0,0 +1,155 @@
This is the most common issue one will encounter. Banding usually
happens when bit starving and poor settings lead to smoother gradients
becoming abrupt color changes, which obviously ends up looking bad. The
good news is that higher bit depths can help with this issue, since
more values are available to create the gradients. Because of this,
lots of debanding is done in 16 bit, then dithered down to 10 or 8 bit
again after the filtering process is done.\
One important thing to note about debanding is that you should always
try to use a mask with it, e.g. an edge mask or similar. See
[the masking section](masking) for details!\
There are three great tools for VapourSynth that are used to fix
banding: [`neo_f3kdb`](https://github.com/HomeOfAviSynthPlusEvolution/neo_f3kdb/), `fvsfunc`'s `gradfun3`, which has a built-in
mask, and `vs-placebo`'s `placebo.Deband`.\
Let's take a look at `neo_f3kdb` first. The default relevant code for
VapourSynth looks as follows:

```py
deband = core.neo_f3kdb.deband(src=clip, range=15, y=64, cb=64, cr=64, grainy=64, grainc=64, dynamic_grain=False, output_depth=16, sample_mode=2)
```

These settings may come off as self-explanatory for some, but here's
what they do:
- `src` This is obviously your source clip.

- `range` This specifies the range of pixels that are used to
    calculate whether something is banded. A higher range means more
    pixels are used for the calculation, meaning it requires more
    processing power. The default of 15 should usually be fine.

- `y` The most important setting, since most (noticeable) banding
    takes place on the luma plane. It specifies how big the difference
    has to be for something on the luma plane to be considered banded.
    You should start low and slowly but surely build this up until the
    banding is gone. If it's set too high, lots of detail will be seen
    as banding and hence be blurred.

- `cb` and `cr` The same as `y`, but for chroma. However, banding on
    the chroma planes is quite uncommon, so you can often leave this
    off.

- `grainy` and `grainc` In order to keep banding from re-occurring and
    to counteract smoothing, grain is usually added after the debanding
    process. However, as this fake grain is quite noticeable, it's
    recommended to be conservative. Alternatively, you can use a custom
    grainer, which will get you a far nicer output (see
    [the graining section](graining)).

- `dynamic_grain` By default, grain added by `f3kdb` is static. This
    compresses better, since there's obviously less variation, but it
    usually looks off with live action content, so it's normally
    recommended to set this to `True` unless you're working with
    animated content.

- `sample_mode` Is explained in the README. Consider switching to 4,
    since it might have less detail loss.

- `output_depth` You should set this to whatever bit depth you want to
    work in after debanding. If you're working in 8 bit the entire
    time, you can just leave out this option.
Here's an example of very simple debanding:

![Source on left,
`deband = core.f3kdb.Deband(src, y=64, cr=0, cb=0, grainy=32, grainc=0, range=15, keep_tv_range=True)`
on right. Zoomed in with a factor of
2.](Pictures/banding_before3.png){#fig:1}

![Source on left,
`deband = core.f3kdb.Deband(src, y=64, cr=0, cb=0, grainy=32, grainc=0, range=15, keep_tv_range=True)`
on right. Zoomed in with a factor of
2.](Pictures/banding_after3.png){#fig:2}

It's recommended to use a mask along with this for debanding,
specifically `retinex_edgemask` or `debandmask` (although the former is
a lot slower). More on this in [the masking section](masking).\
\
The most commonly used alternative is `gradfun3`, which is likely less
popular mainly due to its less straightforward parameters. It works by
smoothing the source via a [bilateral filter](https://en.wikipedia.org/wiki/Bilateral_filter), limiting this via
`mvf.LimitFilter` to the values specified by `thr` and `elast`, then
merging with the source via its internal mask (although using an
external mask is also possible). While it's possible to use it for
descaling, most prefer to do so in another step.\
Many people believe `gradfun3` produces smoother results than `f3kdb`.
As it has more options than `f3kdb` and one doesn't have to bother with
masks most of the time when using it, it's certainly worth knowing how
to use:

```py
import fvsfunc as fvf
deband = fvf.GradFun3(src, thr=0.35, radius=12, elast=3.0, mask=2, mode=3, ampo=1, ampn=0, pat=32, dyn=False, staticnoise=False, smode=2, thr_det=2 + round(max(thr - 0.35, 0) / 0.3), debug=False, thrc=thr, radiusc=radius, elastc=elast, planes=list(range(src.format.num_planes)), ref=src, bits=src.format.bits_per_sample)  # + resizing variables
```
Lots of these values are for `fmtconv` bit depth conversion, so its
[documentation](https://github.com/EleonoreMizo/fmtconv/blob/master/doc/fmtconv.html) can prove helpful. Descaling in `GradFun3`
isn't much different from other descalers, so I won't discuss it here.
Some of the other values that might be of interest are:

- `thr` is equivalent to `y`, `cb`, and `cr` in what it does. You'll
    likely want to raise or lower it.

- `radius` has the same effect as `f3kdb`'s `range`.

- `smode` sets the smooth mode. It's usually best left at its default,
    a bilateral filter[^19], or set to 5 if you'd like to use a
    CUDA-enabled GPU instead of your CPU. It uses `ref` (defaults to
    the input clip) as a reference clip.

- `mask` disables the mask if set to 0. Otherwise, it sets the number
    of `std.Maximum` and `std.Minimum` calls to be made.

- `planes` sets which planes should be processed.

- `mode` is the dither mode used by `fmtconv`.

- `ampn` and `staticnoise` set how much noise should be added by
    `fmtconv` and whether it should be static. Worth tinkering with for
    live action content. Note that these only do anything if you feed
    `GradFun3` a sub-16 bit depth clip.

- `debug` allows you to view the mask.

- `elast` is "the elasticity of the soft threshold." Higher values
    will do more blending between the debanded and the source clip.

For a more in-depth explanation of what `thr` and `elast` do, check the
algorithm explanation in [`mvsfunc`](https://github.com/HomeOfVapourSynthEvolution/mvsfunc/blob/master/mvsfunc.py#L1735).
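As a rough sketch of that soft-threshold behavior (my reading of the linked `mvsfunc` code, so treat the exact formula as an assumption): changes within `thr` are kept, changes beyond `thr * elast` are discarded, and everything in between is blended linearly:

```python
# Sketch of LimitFilter-style soft thresholding, simplified to a single
# pixel (mvsfunc operates on whole clips). diff is how far the filtered
# value moved away from the source.

def limit_filter(src: float, flt: float, thr: float, elast: float) -> float:
    diff = flt - src
    if abs(diff) <= thr:            # small change: keep the filtered value
        return flt
    if abs(diff) >= thr * elast:    # large change: reject it, keep the source
        return src
    # in between: fade linearly from "keep filtered" to "keep source"
    weight = (thr * elast - abs(diff)) / (thr * elast - thr)
    return src + diff * weight

print(limit_filter(100, 101, thr=2, elast=3.0))  # 101: small change kept
print(limit_filter(100, 120, thr=2, elast=3.0))  # 100: large change rejected
print(limit_filter(100, 104, thr=2, elast=3.0))  # 102.0: halfway blend
```

This is why raising `elast` makes the result look smoother: more of the bilateral filter's larger changes survive, partially blended, instead of being cut off at the threshold.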
The third debander worth considering is `placebo.Deband`. It uses the
mpv debander, which just averages pixels within a range and outputs the
average if the difference is below a threshold. The algorithm is
explained in the [source code](https://github.com/haasn/libplacebo/blob/master/src/shaders/sampling.c#L167). As this function is fairly new and
still has its quirks, it's best to refer to the [README](https://github.com/Lypheo/vs-placebo#placebodebandclip-clip-int-planes--1-int-iterations--1-float-threshold--40-float-radius--160-float-grain--60-int-dither--true-int-dither_algo--0) and to ask
experienced encoders on IRC for advice.
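The averaging idea described above can be sketched in a few lines of plain Python (an illustration of the concept only, not libplacebo's actual shader code):

```python
# Average the sampled neighbours; use the average only if it differs
# from the centre pixel by less than the threshold, otherwise keep the
# centre pixel so real detail and edges survive.

def deband_pixel(center, neighbours, threshold):
    avg = sum(neighbours) / len(neighbours)
    return avg if abs(avg - center) < threshold else center

print(deband_pixel(100, [101, 99, 102, 98], threshold=4))    # 100.0: banding smoothed
print(deband_pixel(100, [140, 150, 145, 155], threshold=4))  # 100: edge preserved
```

The threshold plays the same role as `f3kdb`'s `y`: set it too high and genuine detail gets averaged away; too low and the banding survives.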
If your debanded clip had very little grain compared to parts with no
banding, you should consider using a separate function to add matched
grain so the scenes blend together more easily. If there was a lot of
grain, you might want to consider `adptvgrnMod`, `adaptive_grain`, or
`GrainFactory3`; for less obvious grain, or simply for brighter scenes
where there'd usually be very little grain, you can also use
`grain.Add`. This topic will be elaborated further in
[the graining section](graining).\
Here's an example from Mirai (full script later):

![Source on left, filtered on right. Banding might seem hard to spot
here, but I can't include larger images in this PDF. The idea should be
obvious, though.](Pictures/banding_graining.png){#fig:3}

If you want to automate your banding detection, you can use a detection
function based on `bandmask`[^23] called `banddtct`[^24]. Make sure to
adjust the values properly and check the full output. A forum post
explaining it is linked in [the posts section](posts). You can also just
run `adptvgrnMod` or `adaptive_grain` with a high `luma_scaling` value
in the hope that the grain covers the banding up fully. More on this in
[the graining section](graining).
src/filtering/deblocking.md (new file, +17 lines)
@@ -0,0 +1,17 @@
Deblocking is mostly equivalent to smoothing the source, usually with
another mask on top. The most popular function here is `Deblock_QED`
from `havsfunc`. The main parameters are:

- `quant1`: Strength of block edge deblocking. Default is 24. You may
    want to raise this value significantly.

- `quant2`: Strength of block internal deblocking. Default is 26.
    Again, raising this value may prove to be beneficial.

Other popular options are `deblock.Deblock`, which is quite strong but
almost always works,\
`dfttest.DFTTest`, which is weaker but still quite aggressive, and
`fvf.AutoDeblock`, which is quite useful for deblocking MPEG-2 sources
and can be applied to the entire video. Another popular method is to
simply deband, as deblocking and debanding are very similar. This is a
decent option for AVC Blu-ray sources.
src/filtering/dehaloing.md (new file, +48 lines)
@@ -0,0 +1,48 @@
Haloing is a lot like what it sounds like: thick, bright lines around
edges. These are quite common with poorly resized content. You may also
find that bad descaling or descaling of bad sources can produce
noticeable haloing. To fix this, you should use either `havsfunc`'s
`DeHalo_alpha` or its already masked counterpart, `FineDehalo`. If
using the former, you'll *have* to write your own mask, as unmasked
dehaloing usually leads to awful results. For a walkthrough of how to
write a simple dehalo mask, check [encode.moe](https://encode.moe)'s
guide[^37]. Note that `FineDehalo` was written for SD content and its
mask might not work very well with higher resolutions, so it's worth
considering writing your own mask and using `DeHalo_alpha` instead.

As `FineDehalo` is a wrapper around `DeHalo_alpha`, they share some
parameters:

```py
FineDehalo(src, rx=2.0, ry=None, thmi=80, thma=128, thlimi=50, thlima=100, darkstr=1.0, brightstr=1.0, showmask=0, contra=0.0, excl=True, edgeproc=0.0)  # ry defaults to rx
DeHalo_alpha(clp, rx=2.0, ry=2.0, darkstr=1.0, brightstr=1.0, lowsens=50, highsens=50, ss=1.5)
```

The explanations on the AviSynth wiki are good enough:
<http://avisynth.nl/index.php/DeHalo_alpha#Syntax_and_Parameters> and
<http://avisynth.nl/index.php/FineDehalo#Syntax_and_Parameters>.
`DeHalo_alpha` works by downscaling the source according to `rx` and
|
||||
`ry` with a mitchell bicubic ($b=\nicefrac{1}{3},\ c=\nicefrac{1}{3}$)
|
||||
kernel, scaling back to source resolution with blurred bicubic, and
|
||||
checking the difference between a minimum and maximum (check
|
||||
[3.2.14](#masking){reference-type="ref" reference="masking"} if you
|
||||
don't know what this means) for both the source and resized clip. The
|
||||
result is then evaluated to a mask according to the following
|
||||
expressions, where $y$ is the maximum and minimum call that works on the
|
||||
source, $x$ is the resized source with maximum and minimum, and
|
||||
everything is scaled to 8-bit:
|
||||
$$\texttt{mask} = \frac{y - x}{y + 0.0001} \times \left[255 - \texttt{lowsens} \times \left(\frac{y + 256}{512} + \frac{\texttt{highsens}}{100}\right)\right]$$

This mask is used to merge the source back into the resized source. Now,
the smaller value of each pixel is taken for a Lanczos resize to
$(\texttt{height} \times \texttt{ss})\times(\texttt{width} \times \texttt{ss})$
of the source and a maximum of the merged clip resized to the same
resolution with a Mitchell kernel. The result of this is evaluated along
with the minimum of the merged clip resized to the aforementioned
resolution with a Mitchell kernel to find the minimum of each pixel in
these two clips. This is then resized to the original resolution via a
Lanczos resize, and the result is merged into the source via the
following:

```
if original < processed
    x - (x - y) * darkstr
else
    x - (x - y) * brightstr
```

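Pieced together in plain Python, the mask expression and the final `darkstr`/`brightstr` merge look roughly like this (an illustrative paraphrase of the formulas above, not `havsfunc`'s actual code; all values are 8-bit):

```py
def dehalo_mask(y, x, lowsens=50, highsens=50):
    # y: min/max-processed source pixel, x: min/max-processed resized pixel
    return (y - x) / (y + 0.0001) * (255 - lowsens * ((y + 256) / 512 + highsens / 100))

def dehalo_merge(x, y, darkstr=1.0, brightstr=1.0):
    # x: original pixel, y: dehaloed pixel; a strength of 0 keeps the
    # source untouched, 1 takes the dehaloed value in full
    strength = darkstr if x < y else brightstr
    return x - (x - y) * strength
```

Note how `darkstr`/`brightstr` simply interpolate between the source and dehaloed pixel, which is why lowering them is the usual way to keep dark or bright areas from being overfiltered.
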

src/filtering/dehardsubbing.md

While this issue is particularly common with anime, it does also occur
in some live action sources, and many music videos or concerts played
on TV stations will have logos, hence it's worth looking at how to remove
hardsubs or logos. For logos, the `Delogo` plugin is well worth
considering. To use it, you're going to need the `.lgd` file of the
logo. You can simply look for this via your favorite search engine and
something should show up. From there, it should be fairly
straightforward what to do with the plugin.

The most common way of removing hardsubs is to compare two sources, one
with hardsubs and one reference source with no hardsubs. The functions
I'd recommend for this are `hardsubmask` and `hardsubmask_fades` from
`kagefunc`[^42]. The former is only useful for sources with black and
white subtitles, while the latter can be used for logos as well as
moving subtitles. Important parameters for both are the `expand`
options, which imply `std.Maximum` calls. Depending on how good your
sources are and how much gets detected, you may want to lower these
values.

Once you have your mask ready, you'll want to merge in your reference
hardsub-less source with the main source. You may want to combine this
process with some tinting, as not all sources will have the same colors.
It's important to note that doing this will yield far better results
than swapping out a good source for a bad one. If you're lazy, these
masks can usually be applied to the entire clip with no problem, so you
won't have to go through the entire video looking for hardsubbed areas.

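The comparison idea can be sketched in plain Python on two aligned 8-bit luma frames (a toy illustration, not `kagefunc`'s implementation; the `thresh` and `expand` values here are made up):

```py
def hardsub_mask(hardsubbed, clean, thresh=20, expand=1):
    # Mark pixels where the two sources differ by more than `thresh`,
    # then dilate the mask `expand` times (akin to std.Maximum calls).
    h, w = len(hardsubbed), len(hardsubbed[0])
    mask = [[255 if abs(hardsubbed[y][x] - clean[y][x]) > thresh else 0
             for x in range(w)] for y in range(h)]
    for _ in range(expand):
        mask = [[max(mask[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2)))
                 for x in range(w)] for y in range(h)]
    return mask
```

Here `expand` plays the role of the `std.Maximum` calls mentioned above: it grows the mask so the merge also covers the subtitles' anti-aliased edges.
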

src/filtering/denoising.md

Denoising is a rather touchy subject. Live action encoders will never
denoise, while anime encoders will often denoise too much. The main
reason you'll want to do this for anime is that it shouldn't have any on
its own, but compression introduces noise, and bit depth conversion
introduces dither. The former is unwanted, while the latter is wanted.
You might also encounter intentional grain in things like flashbacks.
Removing unwanted noise will aid in compression and kill some slight
dither/grain; this is useful for 10-bit, since smoother sources simply
encode better, while 8-bit is nicer with more grain to keep banding from
appearing, etc. However, sometimes you might encounter cases where
you'll have to denoise/degrain for reasons other than compression. For
example, let's say you're encoding an anime movie in which there's a
flashback scene to one of the original anime episodes. Anime movies are
often 1080p productions, but most series aren't. So, you might encounter
an upscale with lots of 1080p grain on it. In this case, you'll want to
degrain, rescale, and merge the grain back[^38]:

```py
degrained = core.knlm.KNLMeansCL(src, a=1, h=1.5, d=3, s=0, channels="Y", device_type="gpu", device_id=0)
descaled = fvf.Debilinear(degrained, 1280, 720)
upscaled = nnedi3_rpow2(descaled, rfactor=2).resize.Spline36(1920, 1080).std.Merge(src, [0, 1])
diff = core.std.MakeDiff(src, degrained, planes=[0])
merged = core.std.MergeDiff(upscaled, diff, planes=[0])
```


src/filtering/deringing.md

The term "ringing" can refer to a lot of edge artifacts, with the most
common ones being mosquito noise and edge enhancement artifacts. Ringing
is something very common with low quality sources. However, due to poor
equipment and atrocious compression methods, even high bitrate concerts
are prone to this. To fix this, it's recommended to use something like
`HQDeringmod` or `EdgeCleaner` (from `scoll`), with the former being my
recommendation. The rough idea behind these is to blur and sharpen
edges, then merge via edgemasks. They're quite simple, so you can just
read through them yourself and you should get a decent idea of what
they'll do. As `rgvs.Repair` can be quite aggressive, I'd recommend
playing around with the repair values if you use one of these functions
and the defaults don't produce decent enough results.

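The blur-and-limit idea these functions build on can be shown on a single row of pixels (pure Python, purely illustrative; the real functions work on 2-D planes, use `rgvs.Repair`, and merge via edgemasks):

```py
def dering_row(row, limit=10):
    # Box-blur each pixel with its neighbors, then clamp the change
    # to +/- limit so flat areas and strong edges are barely touched.
    out = []
    for i, v in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, len(row) - 1)]
        blurred = (left + v + right) / 3
        out.append(v + max(-limit, min(limit, blurred - v)))
    return out
```
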

![`HQDeringmod(mrad=5, msmooth=10, drrep=0)` on the right, source on the
left. This is *very* aggressive deringing that I usually wouldn't
recommend. The source image is from a One Ok Rock concert Blu-ray with
37 mbps video.](Pictures/dering.png){#fig:17}

src/filtering/descaling.md

While most movies are produced at 2K resolution and most anime are made
at 720p, Blu-rays are almost always 1080p and UHD Blu-rays are all 4K.
This means the mastering house often has to upscale footage. These
upscales are pretty much always terrible, but luckily, some are
reversible. Since anime is usually released at a higher resolution than
the source images, and bilinear or bicubic upscales are very common,
most descalers are written for anime, and it's the main place where
you'll need to descale. Live action content usually can't be descaled
because of bad proprietary scalers (often QTEC or the likes), hence most
live action encoders don't know about or consider descaling.

So, if you're encoding anime, always make sure to check what the source
images are. You can use <https://anibin.blogspot.com/> for this, run
screenshots through `getnative`[^36], or simply try it out yourself. The
last option is obviously the best way to go about this, but `getnative` is
usually very good, too, and is a lot easier. Anibin, while also useful,
won't always get you the correct resolution.

In order to perform a descale, you should be using `fvsfunc`:

```py
import fvsfunc as fvf

descaled = fvf.Debilinear(src, 1280, 720, yuv444=False)
```

In the above example, we're descaling a bilinear upscale to 720p and
downscaling the chroma with `Spline36` to 360p. If you're encoding anime
for a site/group that doesn't care about hardware compatibility, you'll
probably want to turn on `yuv444` and change your encode settings
accordingly.

`Descale` supports bilinear, bicubic, and spline upscale kernels. Each
of these, apart from `Debilinear`, also has its own parameters. For
`Debicubic`, these are:

- `b`: between 0 and 1, this is equivalent to the blur applied.

- `c`: also between 0 and 1, this is the sharpening applied.

The most common cases are `b=1/3` and `c=1/3`, which are the default
values, `b=0` and `c=1`, which is oversharpened bicubic, and `b=1` and
`c=0`, which is blurred bicubic. In-between values are quite common,
too, however.

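For reference, `b` and `c` are the parameters of the Mitchell-Netravali bicubic kernel; here is a quick pure-Python transcription of the standard formula (illustrative, not `Descale`'s implementation):

```py
def bicubic_weight(x, b=1/3, c=1/3):
    # Mitchell-Netravali kernel; the (b, c) pair you hand Debicubic
    # should be whatever the original upscaler used.
    x = abs(x)
    if x < 1:
        return ((12 - 9 * b - 6 * c) * x ** 3
                + (-18 + 12 * b + 6 * c) * x ** 2
                + (6 - 2 * b)) / 6
    if x < 2:
        return ((-b - 6 * c) * x ** 3
                + (6 * b + 30 * c) * x ** 2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6
    return 0.0
```

The four tap weights around any sample position sum to 1, which is what makes a bicubic upscale (and therefore its descale) well defined for any `b`/`c` pair.
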

Similarly, `Delanczos` has the `taps` option, and spline upscales can be
reversed with `Despline36` for `Spline36` upscales and `Despline16` for
`Spline16` upscales.

Once you've descaled, you might want to upscale back to 1080p or
2160p. If you don't have a GPU available, you can do this via `nnedi3`,
or more specifically, `edi3_rpow2` or `nnedi3_rpow2`:

```py
from edi3_rpow2 import nnedi3_rpow2

descaled = fvf.Debilinear(src, 1280, 720)
upscaled = nnedi3_rpow2(descaled, 2).resize.Spline36(1920, 1080)
out = core.std.Merge(upscaled, src, [0, 1])
```

What we're doing here is descaling a bilinear upscale to 720p, then
using `nnedi3` to upscale it to 1440p, downscaling that back to 1080p,
then merging it with the source's chroma. Those with a GPU available
should refer to the `FSRCNNX` example in the resizing section
([3.1.1](#resize){reference-type="ref" reference="resize"}).\
There are multiple reasons you might want to do this:

- Most people don't have a video player set up to properly upscale the
  footage.

- Those who aren't very informed often think higher resolution =
  better quality, hence 1080p is more popular.

- Lots of private trackers only allow 720p and 1080p footage. Maybe
  you don't want to hurt the chroma or the original resolution is in
  between (810p and 900p are very common) and you want to upscale to
  1080p instead of downscaling to 720p.

Another important thing to note is that credits and other text are often
added after upscaling, hence you need to use a mask to avoid ruining
these. Luckily, you can simply add an `M` after the descale name
(`DebilinearM`) and you'll get a mask applied. However, this
significantly slows down the descale process, so you may want to
scenefilter here.

On top of the aforementioned common descale methods, there are a few
more filters worth considering, although they all do practically the
same thing, which is downscaling line art (aka edges) and rescaling them
to the source resolution. This is especially useful if lots of dither
was added after upscaling.

- `DescaleAA`: part of `fvsfunc`, uses a `Prewitt` mask to find line
  art and rescales that.

- `InsaneAA`: uses a strengthened `Sobel` mask and a mixture of both
  `eedi3` and `nnedi3`.

Personally, I don't like upscaling it back and would just stick with a
YUV444 encode. If you'd like to do this, however, you can also consider
trying to write your own mask. An example would be (going off the
previous code):

```py
mask = kgf.retinex_edgemask(src).std.Binarize(15000).std.Inflate()
new_y = core.std.MaskedMerge(src, upscaled, mask)
new_clip = core.std.ShufflePlanes([new_y, u, v], [0, 0, 0], vs.YUV)  # u and v from a previous split of src
```

Beware with descaling, however, that using incorrect resolutions or
kernels will only intensify issues brought about by bad upscaling, such
as ringing and aliasing. It's for this reason that you need to be sure
you're using the correct kernel and resolution, so always manually
double-check your descale. You can do this via something like the
following code:

```py
import fvsfunc as fvf
from vsutil import join, split

y, u, v = split(source)  # this splits the source planes into separate clips
descale = fvf.Debilinear(y, 1280, 720)
rescale = descale.resize.Bilinear(1920, 1080)
merge = join([rescale, u, v])
out = core.std.Interleave([source, merge])
```

If you can't determine the correct kernel and resolution, just downscale
with a normal `Spline36` resize. It's usually easiest to determine
source resolution and kernels in brighter frames with lots of blur-free
details.

In order to illustrate the difference, here are examples of rescaled
footage. Do note that YUV444 downscales scaled back up by a video player
will look better.

![Source Blu-ray with 720p footage upscaled to 1080p via a bicubic
filter on the left, rescale with `Debicubic` and `nnedi3` on the
right.](Pictures/bicubic.png){#fig:6}

It's important to note that this is certainly possible with live action
footage as well. An example would be the Game of Thrones season 1 UHD
Blu-rays, which are bilinear upscales. While not as noticeable in
screenshots, the difference is stunning during playback.

![Source UHD Blu-ray with 1080p footage upscaled to 2160p via a bilinear
filter on the left, rescale with `Debilinear` and `nnedi3` on the
right.](Pictures/bilinear_before2.png){#fig:7}

![Source UHD Blu-ray with 1080p footage upscaled to 2160p via a bilinear
filter on the left, rescale with `Debilinear` and `nnedi3` on the
right.](Pictures/bilinear_after2.png){#fig:7}

If your video seems to have multiple source resolutions in every frame
(i.e. different layers are in different resolutions), which you can
notice by `getnative` outputting multiple results, your best bet is to
downscale to the lowest resolution via `Spline36`. While you technically
can mask each layer to descale all of them to their source resolution,
then scale each one back up, this is far too much effort for it to be
worth it.


src/filtering/detinting.md

If you've got a better source with a tint and a worse source without a
tint, and you'd like to remove it, you can do so via `timecube` and
DrDre's Color Matching Tool[^41]. First, add two reference screenshots
to the tool, export the LUT, save it, and add it via something like:

```py
clip = core.resize.Point(src, matrix_in_s="709", format=vs.RGBS)
detint = core.timecube.Cube(clip, "LUT.cube")
out = core.resize.Point(detint, matrix=1, format=vs.YUV420P16, dither_type="error_diffusion")
```

![Source with tint on left, tint removed on right. This example is from
the D-Z0N3 encode of Your Name (2016). Some anti-aliasing was also
performed on this frame.](Pictures/detint.png){#fig:8}

Similarly, if you have what's known as a gamma bug, or more precisely,
double range compression (applying full to limited range compression to
an already limited range clip), just do the following (for 16-bit):

```py
out = core.std.Levels(src, gamma=0.88, min_in=4096, max_in=60160, min_out=4096, max_out=60160, planes=0)
```

![Double range compression on left, gamma bug fix on
right.](Pictures/gamma.png){#fig:9}

0.88 is usually going to be the required value, but it's not unheard of
to have to apply different gamma values. This is necessary if blacks
have a luma value of 30 instead of 16. Do not perform this operation
in low bit depth. The reasoning for this can be seen in figure
[20](#fig:flvls){reference-type="ref" reference="fig:flvls"}. If the
chroma planes are also affected, you'll have to deal with them
separately:

```py
out = core.std.Levels(src, gamma=0.88, min_in=4096, max_in=61440, min_out=4096, max_out=61440, planes=[1, 2])
```

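Per pixel, the luma call above amounts to roughly the following mapping (my paraphrase of a textbook levels-with-gamma formula in 16-bit; VapourSynth's exact rounding may differ):

```py
def levels_gamma(v, gamma=0.88, min_in=4096, max_in=60160, min_out=4096, max_out=60160):
    # 16-bit levels-with-gamma mapping; gamma < 1 darkens the tonal range,
    # pulling double-range-compressed values back toward where they belong.
    n = (min(max(v, min_in), max_in) - min_in) / (max_in - min_in)
    return n ** (1 / gamma) * (max_out - min_out) + min_out
```

The endpoints are left in place while everything in between is pushed down, which is exactly the shape of the double range compression error.
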
You can also use the `fixlvls` wrapper in `awsmfunc` to do all these
operations.

If you have a source with an improper color matrix, you can fix this
with the following:

```py
out = core.resize.Point(src, matrix_in_s='470bg', matrix_s='709')
```

The `'470bg'` matrix is what's also known as 601. To know if you should
be doing this, you'll need some reference sources, preferably not web
sources. Failing that, you can sometimes identify the bad colors by eye
and realize that it's necessary to change the matrix.

![Example of matrix conversion from Burning (2018) with source and fix
interleaved. Used a frame with lots of red, as this is the color where
the difference is most noticeable.](Pictures/matrix_burning.png){#fig:27
width="100%"}


src/filtering/dirty_lines.md

Another very common issue, at least with live action content, is dirty
lines. These are usually found on the borders of video, where a row or
column of pixels exhibits a luma value that is too low compared to its
surrounding rows. Oftentimes, this is due to improper downscaling, most
notably downscaling after applying borders. Dirty lines can also
occur because video editors often won't know that while they're working
in YUV422, meaning their height doesn't have to be mod2, consumer
products will be YUV420, meaning the height has to be mod2, leading to
extra black rows.\
Another form of dirty lines is exhibited when the chroma planes are
present on black bars. Usually, these should be cropped out. The
opposite can also occur, however, where the planes with legitimate luma
information lack chroma information.\
To illustrate what dirty lines might look like, here's an example of
`ContinuityFixer` and chroma-only `FillBorders`:

![Source vs filtered of a dirty line fix from the D-Z0N3 encode of A
Silent Voice. Used `ContinuityFixer` on top three rows and `FillBorders`
on the two leftmost columns. Zoomed in with a factor of
15.](Pictures/dirt.png){#fig:4}

There are six commonly used filters for fixing dirty lines:

- [`rekt`](https://gitlab.com/Ututu/rekt)'s `rektlvls`\
This is basically `FixBrightnessProtect3` and `FixBrightness` in one,
with the added benefit that not the entire frame is processed. Its
values are quite straightforward: raise the adjustment values to
brighten, lower to darken. Set `prot_val` to `None` and it will
function like `FixBrightness`, meaning the adjustment values will
need to be changed.

```py
from rekt import rektlvls

fix = rektlvls(src, rownum=None, rowval=None, colnum=None, colval=None, prot_val=[16, 235])
```

If you'd like to process multiple rows at a time, you can enter a
list (e.g. `rownum=[0, 1, 2]`).\
In `FixBrightness` mode, this will perform an adjustment with
[`std.Levels`](http://www.vapoursynth.com/doc/functions/levels.html) on
the desired row. This means that, in 8-bit, every possible value $v$ is
mapped to a new value according to the following function:
$$\begin{aligned}
&\forall v \leq 255, v\in\mathbb{N}: \\
&\max\bigg[\min\bigg(\frac{\max(\min(v, \texttt{max\_in}) - \texttt{min\_in}, 0)}{\texttt{max\_in} - \texttt{min\_in}}\times (\texttt{max\_out} - \texttt{min\_out}) + \texttt{min\_out},\ 255\bigg),\ 0\bigg] + 0.5\end{aligned}$$
For positive `adj_val`, $\texttt{max\_in}=235 - \texttt{adj\_val}$. For
negative ones, $\texttt{max\_out}=235 + \texttt{adj\_val}$. The rest of
the values stay at 16 or 235 depending on whether they are maximums or
minimums.\

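As a sanity check, the `FixBrightness` mapping can be written out in plain Python (8-bit, my paraphrase of the formula above; the actual plugin's rounding details may differ):

```py
def fix_brightness(v, adj_val):
    # Positive adj_val lowers max_in (brightens), negative lowers max_out (darkens).
    min_in = min_out = 16
    max_in = 235 - adj_val if adj_val > 0 else 235
    max_out = 235 + adj_val if adj_val < 0 else 235
    scaled = max(min(v, max_in) - min_in, 0) / (max_in - min_in) * (max_out - min_out) + min_out
    return int(max(min(scaled, 255), 0) + 0.5)
```
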
`FixBrightnessProtect3` mode takes this a bit further, performing
(almost) the same adjustment for values between the first
$\texttt{prot\_val} + 10$ and the second $\texttt{prot\_val} - 10$,
where it scales linearly. Its adjustment value does not work the
same, however, so you have to play around with it.

To illustrate this, let's look at the dirty lines in the black and
white Blu-ray of Parasite (2019)'s bottom rows:

![Parasite b&w source, zoomed via point
resizing.](Pictures/rektlvls_src.png){width=".9\\textwidth"}

In this example, the bottom four rows have alternating brightness
offsets from the next two rows. So, we can use `rektlvls` to raise
luma in the first and third row from the bottom, and again to lower
it in the second and fourth:

```py
fix = rektlvls(src, rownum=[803, 802, 801, 800], rowval=[27, -10, 3, -3])
```

In this case, we are in `FixBrightnessProtect3` mode. We aren't
taking advantage of `prot_val` here, but people usually use this
mode regardless, as there's always a chance it might help. The
result:

![Parasite b&w source with `rektlvls` applied, zoomed via point
resizing.](Pictures/rektlvls_fix.png){width=".9\\textwidth"}

- `awsmfunc`'s `bbmod`\
This is a mod of the original BalanceBorders function. While it
doesn't preserve original data nearly as well as `rektlvls`, it will
lead to decent results with high `blur` and `thresh` values and is
easy to use for multiple rows, especially ones with varying
brightness, where `rektlvls` is no longer useful. If it doesn't
produce decent results, these can be changed, but the function will
get more destructive the lower you set the `blur` value. It's also
significantly faster than the versions in `havsfunc` and `sgvsfunc`,
as only necessary pixels are processed.

```py
import awsmfunc as awf

bb = awf.bbmod(src=clip, left=0, right=0, top=0, bottom=0, thresh=[128, 128, 128], blur=[20, 20, 20], scale_thresh=False, cpass2=False)
```

The arrays for `thresh` and `blur` are again y, u, and v values.
It's recommended to try `blur=999` first, then lowering that and
`thresh` until you get decent values.\
`thresh` specifies how far the result can vary from the input. This
means that the lower this is, the better. `blur` is the strength of
the filter, with lower values being stronger and larger values
being less aggressive. If you set `blur=1`, you're basically copying
rows. If you're having trouble with chroma, you can try activating
`cpass2`, but note that this requires a very low `thresh` to be set,
as this changes the chroma processing significantly, making it quite
aggressive.\
`bbmod` works by blurring the desired rows, input rows, and
reference rows within the image using a blurred bicubic kernel,
whereby the blur amount determines the resolution scaled down to,
$\mathtt{\frac{width}{blur}}$. The output is compared using
expressions and finally merged according to the threshold specified.

For our example, let's again use Parasite (2019), but the SDR UHD
this time. It has irregular dirty lines on the top three rows:

![Parasite SDR UHD source, zoomed via point
resizing.](Pictures/bbmod_src.png){width=".9\\textwidth"}

To fix this, we can apply `bbmod` with a low blur and a low thresh,
meaning we won't change pixels' values by much:

```py
fix = awf.bbmod(src, top=3, thresh=20, blur=20)
```

![Parasite SDR UHD source with `bbmod` applied, zoomed via point
resizing.](Pictures/bbmod_fix.png){width=".9\\textwidth"}

Our output is already a lot closer to what we assume the source
should look like. Unlike `rektlvls`, this function is quite quick to
use, so lazy people (a.k.a. everyone) can use this to fix dirty lines
before resizing, as the difference won't be noticeable after
resizing.

- [`fb`'s](https://github.com/Moiman/vapoursynth-fillborders) `FillBorders`\
This function pretty much just copies the next column/row in line.
While this sounds silly, it can be quite useful when downscaling
leads to more rows being at the bottom than at the top, and one
has to be filled up due to YUV420's mod2 height.

```py
fill = core.fb.FillBorders(src=clip, left=0, right=0, bottom=0, top=0, mode="fillmargins")
```

A very interesting use for this function is one similar to applying
`ContinuityFixer` only to chroma planes, which can be used on gray
borders or borders that don't match their surroundings no matter
what luma fix is applied. This can be done with the following
script:

```py
fill = core.fb.FillBorders(src=clip, left=0, right=0, bottom=0, top=0, mode="fillmargins")
merge = core.std.Merge(clipa=clip, clipb=fill, weight=[0, 1])
```

You can also split the planes and process the chroma planes
individually, although this is only slightly faster. A wrapper that
allows you to specify per-plane values for `fb` is `FillBorders` in
`awsmfunc`.\
`FillBorders` in `fillmargins` mode works by averaging the previous
row's pixels: for each pixel, it takes $3\times$ the left pixel of
the previous row, $2\times$ the middle pixel of the previous row,
and $3\times$ the right pixel of the previous row, then divides the
weighted sum by 8 to arrive at the filled pixel's value.

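That weighting can be sketched for a single row in plain Python (my reading of the description above; the plugin's exact edge handling may differ):

```py
def fillmargins_row(prev_row):
    # Fill a border row from the adjacent row using 3-2-3 weights;
    # indices clamp at the row bounds.
    n = len(prev_row)
    return [(3 * prev_row[max(i - 1, 0)]
             + 2 * prev_row[i]
             + 3 * prev_row[min(i + 1, n - 1)]) // 8
            for i in range(n)]
```
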
To illustrate what a source requiring `FillBorders` might look like,
let's look at Parasite (2019)'s SDR UHD once again, which requires
an uneven crop of 277. However, we can't crop this due to chroma
subsampling, so we need to fill one row. To illustrate this, we'll
only be looking at the top rows. Cropping with respect to chroma
subsampling nets us:

```py
crp = src.std.Crop(top=276)
```

![Parasite source cropped while respecting chroma subsampling,
zoomed via point
resizing.](Pictures/fb_src.png){width=".9\\textwidth"}

Obviously, we want to get rid of the black line at the top, so let's
use `FillBorders` on it:

```py
fil = crp.fb.FillBorders(top=1, mode="fillmargins")
```

![Parasite source cropped while respecting chroma subsampling and
luma fixed via `FillBorders`, zoomed via point
resizing.](Pictures/fb_luma.png){width=".9\\textwidth"}

This already looks better, but the orange tones look washed out.
This is because `FillBorders` only fills one chroma if **two** luma
are fixed. So, we need to fill chroma as well. To make this easier
to write, let's use the `awsmfunc` wrapper:

```py
fil = awf.fb(crp, top=1)
```

![Parasite source cropped while respecting chroma subsampling and
luma and chroma fixed via `FillBorders`, zoomed via point
resizing.](Pictures/fb_lumachroma.png){width=".9\\textwidth"}

Our source is now fixed. Some people may want to resize the chroma
to maintain original aspect ratio while shifting chroma, but whether
this is the way to go is not generally agreed upon (personally, I,
Aicha, disagree with doing this). If you want to go this route:

```py
top = 1
bot = 1
new_height = crp.height - (top + bot)
fil = awf.fb(crp, top=top, bottom=bot)
out = fil.resize.Spline36(crp.width, new_height, src_height=new_height, src_top=top)
```

- [`cf`'s](https://gitlab.com/Ututu/VS-ContinuityFixer) `ContinuityFixer`\
`ContinuityFixer` works by comparing the rows/columns specified to
the amount of rows/columns specified by `range` around it and
finding new values via least squares regression. Results are similar
to `bbmod`, but it creates entirely fake data, so it's preferable to
use `rektlvls` or `bbmod` with a high blur instead. Its settings
look as follows:

```py
fix = core.cf.ContinuityFixer(src=clip, left=[0, 0, 0], right=[0, 0, 0], top=[0, 0, 0], bottom=[0, 0, 0], radius=1920)
```

This is assuming you're working with 1080p footage, as `radius`'s
value is set to the largest value possible as defined by the source's
resolution. I'd recommend a lower value, although not going much
lower than $3$, as at that point, you may as well be copying pixels
(see `FillBorders` above for that). What will probably throw off
most newcomers is the array I've entered as the values for
rows/columns to be fixed. These denote the values to be applied to
the three planes. Usually, dirty lines will only occur on the luma
plane, so you can often leave the other two at a value of 0. Do note
an array is not necessary, so you can also just enter the amount of
rows/columns you'd like the fix to be applied to, and all planes
will be processed.\

`ContinuityFixer` works by calculating the least squares
regression[^29] of the pixels within the radius. As such, it creates
entirely fake data based on the image's likely edges.\
One thing `ContinuityFixer` is quite good at is getting rid of
irregularities such as dots. It's also faster than `bbmod`, but it
should be considered a backup option.

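A toy one-dimensional rendition of that idea (illustrative only; the plugin regresses over 2-D ranges of rows/columns):

```py
def predict_border_pixel(neighbors):
    # neighbors: pixel values at distances 1..n from the border, nearest first.
    # Fit v = slope * d + intercept by least squares, then evaluate at
    # d = 0, i.e. extrapolate the trend out to the border itself.
    n = len(neighbors)
    xs = list(range(1, n + 1))
    mean_x = sum(xs) / n
    mean_y = sum(neighbors) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, neighbors))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x
```

This is also why the result is "fake" data: the border pixel is whatever the local gradient predicts, not anything recovered from the source.
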
Let's look at the `bbmod` example again and apply `ContinuityFixer`:

```py
fix = src.cf.ContinuityFixer(top=[3, 0, 0], radius=6)
```

![Parasite SDR UHD source with `ContinuityFixer` applied, zoomed via
point
resizing.](Pictures/continuityfixer.png){width=".9\\textwidth"}

Let's compare the second, third, and fourth row for each of these:

![Comparison of Parasite SDR UHD source, `bbmod`, and
`ContinuityFixer`](Pictures/cfx_bbm.png){width=".9\\textwidth"}

The result is ever so slightly in favor of `ContinuityFixer` here.

- `edgefixer`'s[^30] `ReferenceFixer`\
This requires the original version of `edgefixer` (`cf` is just an
old port of it, but it's nicer to use and processing hasn't
changed). I've never found a good use for it, but in theory, it's
quite neat. It compares with a reference clip to adjust its edge fix,
as in `ContinuityFixer`:

```py
fix = core.edgefixer.Reference(src, ref, left=0, right=0, top=0, bottom=0, radius=1920)
```

One thing that shouldn't be ignored is that applying these fixes (other
than `rektlvls`) to too many rows/columns may lead to these looking
blurry on the end result. Because of this, it's recommended to use
`rektlvls` whenever possible or carefully apply light fixes to only the
necessary rows. If this fails, it's better to try `bbmod` before using
`ContinuityFixer`.

It's important to *always* fix dirty lines before resizing, as not doing
so will introduce even more dirty lines. However, if you have a single
black line at an edge that you would use `FillBorders` on, you should
instead remove it using your resizer.

For example, to resize a clip with a single filled line at the top to
$1280\times536$ from $1920\times1080$:

```py
top_crop = 138
bot_crop = 138
top_fill = 1
bot_fill = 0
src_height = src.height - (top_crop + bot_crop) - (top_fill + bot_fill)
crop = core.std.Crop(src, top=top_crop, bottom=bot_crop)
fix = core.fb.FillBorders(crop, top=top_fill, bottom=bot_fill, mode="fillmargins")
resize = core.resize.Spline36(fix, 1280, 536, src_top=top_fill, src_height=src_height)
```

If you're dealing with diagonal borders, the proper approach is to mask
the border area and merge the source with a `FillBorders` call. An
example of this (from the D-Z0N3 encode of Your Name (2016)):

![Example of improper borders from Your Name with brightness lowered.
D-Z0N3 is masked, Geek is unmasked. As such, Geek lacks any semblance
of grain, while D-Z0N3 keeps it intact whenever possible. It may have
been smarter to use the `mirror` mode in `FillBorders`, but hindsight is
20/20.](Pictures/improper_borders.png){#fig:25 width="100%"}

Code used by D-Z0N3 (in 16-bit):

```py
mask = core.std.ShufflePlanes(src, 0, vs.GRAY).std.Binarize(43500)
cf = core.fb.FillBorders(src, top=6).std.MaskedMerge(src, mask)
```

Another example of why you should be masking this can be found in the
appendix under [figure 26](#fig:26).

Dirty lines can be quite difficult to spot. If you don't immediately
spot any upon examining borders on random frames, chances are you'll be
fine. If you know there are frames with small black borders on each
side, you can use something like the following script[^31]:

```py
def black_detect(clip, thresh=None):
    if thresh:
        clip = core.std.ShufflePlanes(clip, 0, vs.GRAY).std.Binarize(
            thresh).std.Invert().std.Maximum().std.Inflate().std.Maximum().std.Inflate()
    l = core.std.Crop(clip, right=clip.width // 2)
    r = core.std.Crop(clip, left=clip.width // 2)
    clip = core.std.StackHorizontal([r, l])
    t = core.std.Crop(clip, top=clip.height // 2)
    b = core.std.Crop(clip, bottom=clip.height // 2)
    return core.std.StackVertical([t, b])
```

This script will make values under the threshold (i.e. the black
borders) show up as vertical or horizontal white lines in the middle of
a mostly black background. If no threshold is given, it will simply
center the edges of the clip. You can just skim through your video with
this active. An automated alternative is `dirtdtct`[^32], which scans
the video for you.

Other kinds of variable dirty lines are a bitch to fix and require
checking scenes manually.

An issue very similar to dirty lines is bad borders. During scenes with
different crops (e.g. IMAX or 4:3), the black borders may sometimes not
be entirely black, or be completely messed up. In order to fix this,
simply crop them and add them back. You may also want to fix any dirty
lines that may have occurred along the way:

```py
crop = core.std.Crop(src, left=100, right=100)
clean = core.cf.ContinuityFixer(crop, left=2, right=2, top=0, bottom=0, radius=25)
out = core.std.AddBorders(clean, left=100, right=100)
```

# Graining

As grain and dither are some of the hardest things to compress, many
sources will feature very little of them or obviously destroyed grain.
To counteract this, or simply to aid compression of areas with no
grain, it's often beneficial to manually add grain. In the case of
destroyed grain, you will usually want to remove the grain first before
re-applying it. This is especially beneficial with anime, as a lack of
grain can often make it harder for the encoder to maintain gradients.

As we're manually applying grain, we have the option to opt for static
grain. This is almost never noticeable with anime, and compresses a lot
better, hence it's usually the best option for animated content. It is,
however, often quite noticeable in live action content, hence static
grain is not often used in private tracker encodes.

The standard graining function, which the other functions also use, is
`grain.Add`:

```py
grained = core.grain.Add(clip, var=1, constant=False)
```

The `var` option here signifies the strength. You probably won't want to
raise this too high; if you do, the grain will become noticeable enough
that you're better off trying to match the source's grain instead so it
stays inconspicuous.

The most well-known function for adding grain is `GrainFactory3`. This
function allows you to specify how `grain.Add` should be applied for
three different luma levels (bright, medium, dark). It also scales the
grain with `resize.Bicubic` in order to raise or lower its size, as well
as sharpen it via the resizer's `b` and `c` parameters, which are
modified via the `sharpen` option. It can be quite hard to match grain
with it, as you have to modify size, sharpness, and threshold
parameters. However, it can produce fantastic results, especially for
live action content with more natural grain.

A more automated option is `adaptive_grain`[^39]. This works similarly
to `GrainFactory3`, but instead applies variable amounts of grain to
parts of the frame depending on the overall frame's luma value and
specific areas' luma values. As you have fewer options, it's easier to
use, and it works fine for anime. The dependency on the overall frame's
average brightness also makes it produce very nice results.

In addition to these two functions, a combination of the two called
`adptvgrnMod`[^40] is available. This adds the sharpness and size
specification options from `GrainFactory3` to `adaptive_grain`. As
grain is only generated once for a single (usually smaller than the
frame) image, this often ends up being the fastest function. If the
grain size doesn't change for different luma levels, as is often the
case with digitally produced grain, this can lead to better results
than both of the aforementioned functions.

For those curious what this may look like, please refer to the debanding
example from Mirai in [figure 3](#fig:3), as `adptvgrnMod` was used for
graining in that example.

# Masking

Masking is a less straightforward topic. The idea is to limit the
application of filters according to the source image's properties. A
mask will typically be grayscale, whereby how much of the two clips in
question is applied is determined by the mask's brightness. So, if you
do

```py
mask = mask_function(src)
filtered = filter_function(src)
merge = core.std.MaskedMerge(src, filtered, mask)
```

the `filtered` clip will be used for every completely white pixel in
`mask`, and the `src` clip for every black pixel, with in-between values
determining the ratio in which the two clips are mixed. Typically, a
mask will be constructed using one of the following three functions:

- `std.Binarize`: This simply separates pixels by whether they are
  above or below a threshold and sets them to black or white
  accordingly.

- `std.Expr`: Known to be a very complicated function. Applies logic
  via reverse Polish notation. If you don't know what this is, read up
  on Wikipedia. Some cool things you can do with this are make some
  pixels brighter while keeping others the same (instead of making
  them dark as you would with `std.Binarize`):
  `std.Expr("x 2000 > x 10 * x ?")`. This would multiply every value
  above 2000 by ten and leave the others be. One nice use case is for
  in-between values:
  `std.Expr("x 10000 > x 15000 < and {} 0 ?".format(2**src.format.bits_per_sample - 1))`.
  This makes every value between 10 000 and 15 000 the maximum value
  allowed by the bit depth and the rest zero, just like a
  `std.Binarize` mask would. Many other operations can be performed
  via this.

- `std.Convolution`: In essence, apply a matrix to your pixels. The
  documentation explains it well, so just read that if you don't get
  it. Lots of masks are defined via convolution kernels. You can use
  this to do a whole lot of stuff. For example, if you want to average
  all the values surrounding a pixel, do
  `std.Convolution([1, 1, 1, 1, 0, 1, 1, 1, 1])`. To illustrate, let's
  say you have a pixel with the value $\mathbf{1}$ with the following
  $3\times3$ neighborhood:

  \\[\begin{bmatrix}
  0 & 2 & 4 \\\\
  6 & \mathbf{1} & 8 \\\\
  6 & 4 & 2
  \end{bmatrix}\\]

  Now, let's apply a convolution kernel:

  \\[\begin{bmatrix}
  2 & 1 & 3 \\\\
  1 & 0 & 1 \\\\
  4 & 1 & 5
  \end{bmatrix}\\]

  This will result in the pixel $\mathbf{1}$ becoming:
  \\[\frac{1}{18} \times (2 \times 0 + 1 \times 2 + 3 \times 4 + 1 \times 6 + 0 \times \mathbf{1} + 1 \times 8 + 4 \times 6 + 1 \times 4 + 5 \times 2) = \frac{66}{18} \approx 4\\]

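This weighted-average arithmetic is easy to verify in plain Python, no VapourSynth needed (the neighborhood and kernel are the ones from the matrices above):

```python
# 3x3 neighborhood around the centre pixel (value 1) and the kernel,
# both flattened row-major.
neighborhood = [0, 2, 4,
                6, 1, 8,
                6, 4, 2]
kernel = [2, 1, 3,
          1, 0, 1,
          4, 1, 5]

# std.Convolution multiplies element-wise, sums, then divides by the
# kernel's weight sum (18 here) and rounds to the nearest integer.
weighted_sum = sum(n * k for n, k in zip(neighborhood, kernel))
result = round(weighted_sum / sum(kernel))

print(weighted_sum, result)  # 66 4
```
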
So, let's say you want to perform what is commonly referred to as a
simple "luma mask":

```py
y = core.std.ShufflePlanes(src, 0, vs.GRAY)
mask = core.std.Binarize(y, 5000)
merge = core.std.MaskedMerge(filtered, src, mask)
```

In this case, I'm assuming we're working in 16-bit. What `std.Binarize`
is doing here is making every value under 5000 the lowest and every
value above 5000 the maximum value allowed by our bit depth. This means
that every pixel above 5000 will be copied from the source clip.

Let's try this using a `filtered` clip which has every pixel's value
multiplied by 8:

![Binarize mask applied to luma with filtered clip being
`std.Expr("x 8 *")`.](Pictures/luma_mask.png){width="100%"}

Simple binarize masks on luma are very straightforward and often do a
good job of limiting a filter to the desired area, especially as dark
areas are more prone to banding and blocking.

A more sophisticated version of this is `adaptive_grain` from earlier in
this guide. It scales values from black to white based on both the
pixel's own luma value and the image's average luma value. A more
in-depth explanation can be found on the creator's blog[^43]. We
manipulate this mask using a `luma_scaling` parameter. Let's use a very
high value of 500 here:

![`kgf.adaptive_grain(y, show_mask=True, luma_scaling=500)` mask applied
to luma with filtered clip being
`std.Expr("x 8 *")`.](Pictures/adg_mask.png){width="100%"}

Alternatively, we can use an `std.Expr` to merge the clips via the
following logic:

```py
if abs(src - filtered) <= 1000:
    return filtered
elif abs(src - filtered) >= 30000:
    return src
else:
    return src + (src - filtered) * (30000 - abs(src - filtered)) / 29000
```

This is almost the exact algorithm used in `mvsfunc.LimitFilter`, which
`GradFun3` uses to apply its bilateral filter. In VapourSynth, this
would be:

```py
expr = core.std.Expr([src, filtered], "x y - abs 1000 > x y - abs 30000 > x x y - 30000 x y - abs - * 29000 / + x ? y ?")
```

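The branch logic above can be written out as an ordinary per-pixel Python function, which makes it easier to see what the RPN expression is doing (a plain illustration of the logic, not the actual `mvsfunc` code; the threshold names are mine):

```python
def limit_pixel(src, filtered, thr=1000, elast_thr=30000):
    """LimitFilter-style reference logic for one 16-bit pixel pair."""
    diff = abs(src - filtered)
    if diff <= thr:          # small change: trust the filtered value
        return filtered
    elif diff >= elast_thr:  # huge change: keep the source untouched
        return src
    else:                    # in between: scale the difference back towards src
        return src + (src - filtered) * (elast_thr - diff) / (elast_thr - thr)

print(limit_pixel(10500, 10000))  # small diff -> 10000 (filtered)
print(limit_pixel(50000, 10000))  # large diff -> 50000 (source)
```
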
![`LimitFilter` style expression to apply filter `std.Expr("x 8 *")` to
source.](Pictures/expr_limit.png){width="100%"}

Now, let's move on to the third option: convolutions, or, more
interestingly for us, edge masks. Let's say you have a filter that
smudges details in your clip, but you still want to apply it to
detail-free areas. We can use the following convolutions to locate
horizontal and vertical edges in the image:

\\[\begin{aligned}
&\begin{bmatrix}
1 & 0 & -1 \\\\
2 & 0 & -2 \\\\
1 & 0 & -1
\end{bmatrix}
&\begin{bmatrix}
1 & 2 & 1 \\\\
0 & 0 & 0 \\\\
-1 & -2 & -1
\end{bmatrix}\end{aligned}\\]

Combining these two is what is commonly referred to as a Sobel-type edge
mask. It produces the following for our image of the lion:

![image](Pictures/sobel.png){width="100%"}

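The two kernels are commonly combined via the magnitude of their responses. A plain-Python sketch of the per-pixel math (an illustration of the idea; `std.Sobel` works on whole planes and clamps to the format's range):

```python
import math

# Horizontal and vertical Sobel kernels from above, flattened row-major.
kx = [1, 0, -1,
      2, 0, -2,
      1, 0, -1]
ky = [1, 2, 1,
      0, 0, 0,
      -1, -2, -1]

def sobel_magnitude(neighborhood):
    """Edge response for one pixel given its flattened 3x3 neighborhood."""
    gx = sum(n * k for n, k in zip(neighborhood, kx))
    gy = sum(n * k for n, k in zip(neighborhood, ky))
    return math.hypot(gx, gy)

# A flat area produces no response; a hard vertical edge a strong one.
flat = [5] * 9
edge = [0, 0, 255,
        0, 0, 255,
        0, 0, 255]
print(sobel_magnitude(flat))  # 0.0
print(sobel_magnitude(edge))  # 1020.0
```
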
Now, this result is obviously rather boring. One can see a rough outline
of the background and the top of the lion, but not much more can be made
out. To change this, let's introduce some new functions:

- `std.Maximum/Minimum`: Use these to grow or shrink your mask. You may
  additionally want to apply `coordinates=[0, 1, 2, 3, 4, 5, 6, 7]`
  with whatever numbers work for you in order to specify weights of
  the surrounding pixels.

- `std.Inflate/Deflate`: Similar to the previous functions, but
  instead of taking the maximum of pixels, they merge them, which
  gets you a slight blur of edges. Useful at the end of most masks so
  you get a slight transition between masked areas.

We can combine these with the `std.Binarize` function from before to get
a nifty output:

```py
mask = y.std.Sobel()
binarize = mask.std.Binarize(3000)
maximum = binarize.std.Maximum().std.Maximum()
inflate = maximum.std.Inflate().std.Inflate().std.Inflate()
```

![Sobel mask from before manipulated with `std.Binarize`, `std.Maximum`,
and
`std.Inflate`.](Pictures/sobel_manipulated.png){width="100%"}

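The difference between the two kinds of growth can be sketched per pixel in plain Python (a rough illustration of the documented behavior, not the plugins' exact code):

```python
def maximum_pixel(neighborhood):
    """std.Maximum-style growth: the centre becomes the largest value
    in its flattened 3x3 neighborhood (centre at index 4)."""
    return max(neighborhood)

def inflate_pixel(neighborhood):
    """std.Inflate-style growth: the centre is raised to the average of
    its eight neighbors, but only if that average is higher."""
    centre = neighborhood[4]
    avg = sum(neighborhood[:4] + neighborhood[5:]) / 8
    return max(centre, avg)

# A single bright neighbor next to a black centre pixel:
nb = [0, 0, 0,
      0, 0, 255,
      0, 0, 0]
print(maximum_pixel(nb))  # 255 -> hard growth
print(inflate_pixel(nb))  # 31.875 -> soft, blurred growth
```
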
A common example of a filter that might smudge the output is an
anti-aliasing or a debanding filter. In the case of an anti-aliasing
filter, we apply the filter via the mask to the source, while in the
case of the debander, we apply the source via the mask to the filtered
source:

```py
mask = y.std.Sobel()

aa = taa.TAAmbk(src, aatype=3, mtype=0)
merge = core.std.MaskedMerge(src, aa, mask)

deband = src.f3kdb.Deband()
merge = core.std.MaskedMerge(deband, src, mask)
```

We can also use a different edge mask, namely `kgf.retinex_edgemask`,
which raises contrast in dark areas and creates a second edge mask using
the output of that, then merges it with the edge mask produced using the
untouched image:

![`kgf.retinex_edgemask` applied to
luma.](Pictures/retinex_edgemask.png){width="100%"}

This already looks great. Let's manipulate it similarly to before and
see how it affects a destructive deband in the twig area at the bottom:

```py
deband = src.f3kdb.Deband(y=150, cb=150, cr=150, grainy=0, grainc=0)
mask = kgf.retinex_edgemask(src).std.Binarize(8000).std.Maximum()
merge = core.std.MaskedMerge(deband, src, mask)
```

![A very strong deband protected using
`kgf.retinex_edgemask`.](Pictures/masked_deband.png){width="100%"}

While some details remain smudged, we've successfully recovered a very
noticeable portion of the twigs. Another example of a deband suffering
from detail loss without an edge mask can be found under
[figure 35](#fig:18) in the appendix.

Other noteworthy edge masks easily available in VapourSynth include:

- `std.Prewitt` is similar to Sobel. It's the same operator with the 2
  switched out for a 1.

- `tcanny.TCanny` is basically a Sobel mask thrown over a blurred
  clip.

- `kgf.kirsch` will generate almost identical results to
  `retinex_edgemask` in bright scenes, as it's one of its components.
  It's slower than the others, as it uses more directions, but will
  get you great results.

Some edge mask comparisons can be found in the appendix under
[figures 26](#fig:16), [30](#fig:10), and [34](#fig:23).

As a debanding alternative to edge masks, we can also use "range"
masks, which employ `std.Minimum` and `std.Maximum` to locate details.
The most well-known example of this is the mask inside `GradFun3`, which
works as follows:

Two clips are created, one of which will employ `std.Maximum`, while
the other will obviously use `std.Minimum`. These use special
coordinates depending on the `mrad` value given. If
$\mathtt{mrad} \mod 3 = 1$, `[0, 1, 0, 1, 1, 0, 1, 0]` will be used as
coordinates. Otherwise, `[1, 1, 1, 1, 1, 1, 1, 1]` is used. Then, this
process is repeated with $\mathtt{mrad} = \mathtt{mrad} - 1$ until
$\mathtt{mrad} = 0$. This all probably sounds a bit overwhelming, but
it's really just finding the maximum and minimum values for each pixel
neighborhood.

Once these are calculated, the minimized mask is subtracted from the
maximized mask, and the mask is complete. So, let's look at the output
compared to the modified `retinex_edgemask` from earlier:

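The max-minus-min idea is simple enough to show per pixel in plain Python (an illustration of the concept, not `GradFun3`'s actual implementation):

```python
def range_mask_pixel(neighborhood):
    """Local max minus local min over a flattened 3x3 neighborhood:
    large in detailed/edged areas, near zero in flat (banding-prone) areas."""
    return max(neighborhood) - min(neighborhood)

flat_area = [100, 101, 100,
             100, 100, 101,
             101, 100, 100]
edge_area = [0, 0, 200,
             0, 0, 200,
             0, 0, 200]
print(range_mask_pixel(flat_area))  # 1
print(range_mask_pixel(edge_area))  # 200
```
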
![Comparison of `retinex_edgemask.std.Binarize(8000).std.Maximum()` and
default `GradFun3`.](Pictures/gradfun3_mask.png){width="100%"}

Here, we get some more pixels picked up by the `GradFun3` mask in the
skies and some brighter flat textures. However, the retinex-type edge
mask prevails in darker, more detailed areas. Computationally, our
detail mask is a lot quicker, however, and it does pick up a lot of what
we want, so it's not a bad choice.

Fortunately for us, this isn't the end of these kinds of masks. There
are two notable masks based on this concept: `debandmask`[^44] and
`lvsfunc.denoise.detail_mask`. The former takes our `GradFun3` mask and
binarizes it according to the input luma's brightness. Four parameters
play a role in this process: `lo`, `hi`, `lothr`, and `hithr`. Values
below `lo` are binarized according to `lothr`, values above `hi` are
binarized according to `hithr`, and values in between are binarized
according to a linear scaling between the two thresholds:

\\[\frac{\mathtt{mask} - \mathtt{lo}}{\mathtt{hi} - \mathtt{lo}} \times (\mathtt{hithr} - \mathtt{lothr}) + \mathtt{lothr}\\]

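In plain Python, the formula amounts to linearly interpolating the binarize threshold between `lothr` and `hithr` for mask values between `lo` and `hi` (an illustration of the formula above, not the actual `debandmask` code):

```python
def scaled_threshold(mask_value, lo, hi, lothr, hithr):
    """Linearly interpolate the binarize threshold for in-between values."""
    if mask_value <= lo:
        return lothr
    if mask_value >= hi:
        return hithr
    return (mask_value - lo) / (hi - lo) * (hithr - lothr) + lothr

# Using the guide's sample values: lo=22 << 8, lothr=250, hi=48 << 8, hithr=500.
lo, hi = 22 << 8, 48 << 8
print(scaled_threshold(lo, lo, hi, 250, 500))              # 250
print(scaled_threshold((lo + hi) // 2, lo, hi, 250, 500))  # 375.0
print(scaled_threshold(hi, lo, hi, 250, 500))              # 500
```
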
This makes it more useful in our specific scenario, as the mask becomes
stronger in darks compared to `GradFun3`. When playing around with the
parameters, we can e.g. lower `lo` so our very dark areas aren't
affected too badly, lower `lothr` to make it stronger in these darks,
raise `hi` to enlarge our `lo` to `hi` gap, and raise `hithr` to weaken
it in brights. Simple values might be
`lo=22 << 8, lothr=250, hi=48 << 8, hithr=500`:

![Comparison of `retinex_edgemask.std.Binarize(8000).std.Maximum()`,
default `GradFun3`, and default
`debandmask(lo=22 << 8, lothr=250, hi=48 << 8, hithr=500)`.](Pictures/debandmask_comparison.png){width="100%"}

While not perfect, as this is a tough scene, and parameters might not be
optimal, the difference in darks is obvious, and less of the
background's banding is picked up.

Our other option for an altered `GradFun3` is `lvf.denoise.detail_mask`.
This mask combines the previous idea of the `GradFun3` mask with a
Prewitt-type edge mask.

First, two denoised clips are created using `KNLMeansCL`, one with half
the other's denoise strength. The stronger one has a `GradFun3`-type
mask applied, which is then binarized, while the weaker one has a
Prewitt edge mask applied, which again is binarized. The two are then
combined so the former mask gets any edges it may have missed from the
latter mask.

The output is then put through two calls of `RemoveGrain`, the first one
setting each pixel to the nearest value of its four surrounding pixel
pairs' (e.g. the pixels above and below make up one pair) highest and
lowest average value. The second call effectively performs the
following convolution:

\\[\begin{bmatrix}
1 & 2 & 1 \\\\
2 & 4 & 2 \\\\
1 & 2 & 1
\end{bmatrix}\\]

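Like the earlier convolution example, this kernel is just a weighted average (the weights sum to 16), i.e. a mild Gaussian-style blur; a quick plain-Python check:

```python
# The RemoveGrain blur kernel from above, flattened row-major.
kernel = [1, 2, 1,
          2, 4, 2,
          1, 2, 1]

def blur_pixel(neighborhood):
    """Weighted average of a flattened 3x3 neighborhood with the kernel."""
    weighted = sum(n * k for n, k in zip(neighborhood, kernel))
    return weighted / sum(kernel)  # kernel weights sum to 16

# A lone bright pixel gets spread out rather than kept at full strength.
spike = [0, 0, 0,
         0, 16, 0,
         0, 0, 0]
print(blur_pixel(spike))  # 4.0
```
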
By default, the denoiser is turned off, but enabling it is one of this
mask's advantages for us in this case, as we'd like the sky to have
fewer pixels picked up while we'd prefer more of the rest of the image
to be picked up. To compare, I've used a binarize threshold similar to
the one used in the `debandmask` example. Keep in mind this is a newer
mask, so my inexperience with it might show to those who have played
around with it more:

![Comparison of `retinex_edgemask.std.Binarize(8000).std.Maximum()`,
default `GradFun3`, default
`debandmask(lo=22 << 8, lothr=250, hi=48 << 8, hithr=500)`, and
`detail_mask(pre_denoise=.3, brz_a=300, brz_b=300)`.](Pictures/detail_mask.png){width="100%"}

Although an improvement in some areas, in this case, we aren't quite
getting the step up we would like. Again, better optimized parameters
might have helped.

In case someone wants to play around with the image used here, it's
available in this guide's repository:
<https://git.concertos.live/Encode_Guide/Encode_Guide/src/branch/master/Pictures/lion.png>.

Additionally, the following functions can be of help when masking,
limiting, et cetera:

- `std.MakeDiff` and `std.MergeDiff`: These should be
  self-explanatory. Use cases can be applying something to a degrained
  clip and then merging the clip back, as was elaborated in the
  Denoising section.

- `std.Transpose`: Transpose (i.e. flip along the diagonal) your clip.

- `std.Turn180`: Turns the clip by 180 degrees.

- `std.BlankClip`: Just a frame of a solid color. You can use this to
  replace bad backgrounds or for cases where you've added grain to an
  entire movie but you don't want the end credits to be full of grain.
  To maintain TV range, you can use
  `std.BlankClip(src, color=[16, 128, 128])` for 8-bit black. Also
  useful for making area-based masks.

- `std.Invert`: Self-explanatory. You can also just swap which clip
  gets merged via the mask instead of doing this.

- `std.Limiter`: You can use this to limit pixels to certain values.
  Useful for maintaining TV range (`std.Limiter(min=16, max=235)`).

- `std.Median`: This replaces each pixel with the median value in its
  neighborhood. Mostly useless.

- `std.StackHorizontal`/`std.StackVertical`: Stack clips on top
  of/next to each other.

- `std.Merge`: This lets you merge two clips with given weights. A
  weight of 0 will return the first clip, while 1 will return the
  second. The first thing you give it is a list of clips, and the
  second item is a list of weights for each plane. Here's how to merge
  chroma from the second clip into luma from the first:
  `std.Merge([first, second], [0, 1])`. If no third value is given,
  the second one is copied for the third plane.

- `std.ShufflePlanes`: Extract or merge planes from a clip. For
  example, you can get the luma plane with
  `std.ShufflePlanes(src, 0, vs.GRAY)`.

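The `std.Merge` weighting is a simple linear blend per pixel; ignoring integer rounding, it can be sketched in plain Python as:

```python
def merge_pixel(a, b, weight):
    """Linear blend as std.Merge does per pixel: weight 0 -> a, 1 -> b."""
    return a * (1 - weight) + b * weight

print(merge_pixel(100, 200, 0))    # 100 -> first clip only
print(merge_pixel(100, 200, 1))    # 200 -> second clip only
print(merge_pixel(100, 200, 0.5))  # 150.0 -> even mix
```
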
If you want to apply something to only a certain area, you can use the
wrapper `rekt`[^45] or `rekt_fast`. The latter only applies your
function to the given area, which speeds it up and is quite useful for
anti-aliasing and similar slow filters. Some wrappers around this exist
already, like `rektaa` for anti-aliasing. Functions in `rekt_fast` are
applied via a lambda function, so instead of `src.f3kdb.Deband()`, you
input `rekt_fast(src, lambda x: x.f3kdb.Deband())`.

One more very special function is `std.FrameEval`. What this allows you
to do is evaluate every frame of a clip and apply a frame-specific
function. This is quite confusing, but there are some nice examples in
VapourSynth's documentation:
<http://www.vapoursynth.com/doc/functions/frameeval.html>. Now, unless
you're interested in writing a function that requires this, you likely
won't ever use it. However, many functions use it, including
`kgf.adaptive_grain`, `awf.FrameInfo`, `fvf.AutoDeblock`, `TAAmbk`, and
many more. One example I can think of to showcase this is applying a
different debander depending on frame type:

```py
import functools

def FrameTypeDeband(n, f, clip):
    if f.props['_PictType'].decode() == "B":
        return core.f3kdb.Deband(clip, y=64, cr=0, cb=0, grainy=64, grainc=0, keep_tv_range=True, dynamic_grain=False)
    elif f.props['_PictType'].decode() == "P":
        return core.f3kdb.Deband(clip, y=48, cr=0, cb=0, grainy=64, grainc=0, keep_tv_range=True, dynamic_grain=False)
    else:
        return core.f3kdb.Deband(clip, y=32, cr=0, cb=0, grainy=64, grainc=0, keep_tv_range=True, dynamic_grain=False)

out = core.std.FrameEval(src, functools.partial(FrameTypeDeband, clip=src), src)
```

If you'd like to learn more, I'd suggest reading through the Irrational
Encoding Wizardry GitHub group's guide:
<https://guide.encode.moe/encoding/masking-limiting-etc.html> and
reading through most of your favorite Python functions for VapourSynth.
Pretty much all of the good ones should use some mask or have developed
their own mask for their specific use case.