diff --git a/.drone.yml b/.drone.yml index 72db13c..f0a7907 100644 --- a/.drone.yml +++ b/.drone.yml @@ -8,6 +8,7 @@ steps: image: alpine/git commands: - apk update && apk add --no-cache git-lfs && git lfs install + - git lfs pull - git submodule update --init --recursive - name: build-repo image: golang:1.18-bullseye @@ -30,6 +31,7 @@ steps: image: alpine/git commands: - apk update && apk add --no-cache git-lfs && git lfs install + - git lfs pull - git submodule update --init --recursive - name: build-repo image: golang:1.18-bullseye diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..4b0a82f --- /dev/null +++ b/.gitattributes @@ -0,0 +1 @@ +*.m4a filter=lfs diff=lfs merge=lfs -text diff --git a/README.md b/README.md index 186949a..540b082 100644 --- a/README.md +++ b/README.md @@ -13,15 +13,15 @@ Collection of audio utilities for decoding/encoding files and streams. ## Codecs supported -| Codec | Containers | Decoder | Analyzer | Encoder | Notes | -|:----------:|:----------------------------------------------------------------------------------------:|:-------:|:--------:|:-------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **FLAC** | [FLAC](https://xiph.org/flac/format.html), [Ogg](https://xiph.org/flac/ogg_mapping.html) | ✅ | ✅ | ✅ | Adjustable encoding compression level and block size.
Decoding/encoding by [libFLAC](https://github.com/xiph/flac) via [goflac](https://git.gammaspectra.live/S.O.N.G/goflac). | -| **TTA** | [TTA](https://www.tausoft.org/en/true_audio_codec_format/) | ✅ | ✅ | ✅ | Decoding/encoding via [go-tta](https://git.gammaspectra.live/S.O.N.G/go-tta). | -| **MP3** | [MP3](http://mpgedit.org/mpgedit/mpeg_format/MP3Format.html) | ✅ | - | ✅ | Adjustable encoding bitrate and mode.
Decoding via [minimp3](https://github.com/kvark128/minimp3), encoding by [LAME](https://lame.sourceforge.io/) via [go-lame](https://github.com/viert/go-lame). | -| **Opus** | [Ogg](https://www.xiph.org/ogg/doc/framing.html) | ✅ | - | ✅ | Adjustable encoding bitrate.
Decoding/encoding by [libopus](https://github.com/xiph/opus) via [go-pus](https://git.gammaspectra.live/S.O.N.G/go-pus). | -| **Vorbis** | [Ogg](https://www.xiph.org/ogg/doc/framing.html) | ✅ | - | ❌ | Decoding by [jfreymuth/vorbis](https://github.com/jfreymuth/vorbis) via [jfreymuth/oggvorbis](https://github.com/jfreymuth/oggvorbis). | -| **AAC** | [ADTS](https://wiki.multimedia.cx/index.php/ADTS) | ✅ | - | ✅ | Adjustable encoding bitrate and mode (LC, HEv2).
Decoding/encoding by [FDK-AAC](https://github.com/mstorsjo/fdk-aac) via [go-fdkaac](https://git.gammaspectra.live/S.O.N.G/go-fdkaac). | -| **ALAC** | MP4* | ✅ | ✅ | ✅ | Decoding/encoding by [libalac](https://git.gammaspectra.live/S.O.N.G/alac) via [go-alac](https://git.gammaspectra.live/S.O.N.G/go-alac).
*MP4 Decoding/encoding only supported on fragmented MP4 currently. | +| Codec | Containers | Decoder | Analyzer | Encoder | Notes | +|:----------:|:----------------------------------------------------------------------------------------:|:-------:|:--------:|:-------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **FLAC** | [FLAC](https://xiph.org/flac/format.html), [Ogg](https://xiph.org/flac/ogg_mapping.html) | ✅ | ✅ | ✅ | Adjustable encoding compression level and block size.
Decoding/encoding by [libFLAC](https://github.com/xiph/flac) via [goflac](https://git.gammaspectra.live/S.O.N.G/goflac). | +| **TTA** | [TTA](https://www.tausoft.org/en/true_audio_codec_format/) | ✅ | ✅ | ✅ | Decoding/encoding via [go-tta](https://git.gammaspectra.live/S.O.N.G/go-tta). | +| **MP3** | [MP3](http://mpgedit.org/mpgedit/mpeg_format/MP3Format.html) | ✅ | - | ✅ | Adjustable encoding bitrate and mode.
Decoding via [minimp3](https://github.com/kvark128/minimp3), encoding by [LAME](https://lame.sourceforge.io/) via [go-lame](https://github.com/viert/go-lame). | +| **Opus** | [Ogg](https://www.xiph.org/ogg/doc/framing.html) | ✅ | - | ✅ | Adjustable encoding bitrate.
Decoding/encoding by [libopus](https://github.com/xiph/opus) via [go-pus](https://git.gammaspectra.live/S.O.N.G/go-pus). | +| **Vorbis** | [Ogg](https://www.xiph.org/ogg/doc/framing.html) | ✅ | - | ❌ | Decoding by [jfreymuth/vorbis](https://github.com/jfreymuth/vorbis) via [jfreymuth/oggvorbis](https://github.com/jfreymuth/oggvorbis). | +| **AAC** | [ADTS](https://wiki.multimedia.cx/index.php/ADTS), ADIF*, MP4** | ✅ | - | ✅ | Adjustable encoding bitrate and mode (LC, HEv2).
Decoding/encoding by [FDK-AAC](https://github.com/mstorsjo/fdk-aac) via [go-fdkaac](https://git.gammaspectra.live/S.O.N.G/go-fdkaac).
If the [go-fdkaac](https://git.gammaspectra.live/S.O.N.G/go-fdkaac) codec is disabled, the [VisualOn AAC encoder](https://github.com/gen2brain/aac-go) is used instead, with limited encoding support.
*ADIF is supported for encoding only.
**MP4 encoding only supported on fragmented MP4 currently. | +| **ALAC** | MP4* | ✅ | ✅ | ✅ | Decoding/encoding by [libalac](https://git.gammaspectra.live/S.O.N.G/alac) via [go-alac](https://git.gammaspectra.live/S.O.N.G/go-alac).
Disabled by default.
*MP4 encoding only supported on fragmented MP4 currently. | ## Container packetizers supported @@ -33,6 +33,7 @@ Collection of audio utilities for decoding/encoding files and streams. | **Ogg** | ✅ | ✅ | ✅* | *Sample numbers (absolute granule position in Ogg) depend on underlying codec implementing it.
Has been tested as working for Opus | | **ADTS** | ✅ | ✅ | ✅ | Uses [edgeware/mp4ff](https://github.com/edgeware/mp4ff) for its ADTS frame parser. | | **MP4** | ❌ | - | - | | +| **ADIF** | ❌ | - | - | | ## Dependencies ### Go >= 1.18 diff --git a/audio/format/aac/libfdk-aac.go b/audio/format/aac/libfdk-aac.go index 1673c80..b89ad39 100644 --- a/audio/format/aac/libfdk-aac.go +++ b/audio/format/aac/libfdk-aac.go @@ -4,12 +4,16 @@ package aac import ( + "bytes" + "errors" "fmt" "git.gammaspectra.live/S.O.N.G/Kirika/audio" "git.gammaspectra.live/S.O.N.G/Kirika/cgo" "git.gammaspectra.live/S.O.N.G/go-fdkaac/fdkaac" aac_adts "github.com/edgeware/mp4ff/aac" + "github.com/edgeware/mp4ff/mp4" "io" + "time" "unsafe" ) @@ -25,18 +29,16 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "libfdk-aac (go-fdkaac)" + return "libfdk-aac (S.O.N.G/go-fdkaac)" } func decodeFrame(decoder *fdkaac.AacDecoder, r io.Reader) ([]float32, error) { - pcm, err := decoder.Decode() - + pcm, err := tryDecodeFrame(decoder) if err != nil { return nil, err } - if pcm != nil { - return cgo.Int32ToFloat32(cgo.BytesToInt32(pcm, 16), 16), nil + return pcm, err } header, _, err := aac_adts.DecodeADTSHeader(r) @@ -51,8 +53,11 @@ func decodeFrame(decoder *fdkaac.AacDecoder, r io.Reader) ([]float32, error) { return nil, err } - _, err = decoder.Fill(append(header.Encode(), data...)) - + var n int + fullData := append(header.Encode(), data...) 
+ if n, err = decoder.Fill(fullData); n != 0 { + return nil, errors.New("buffer under read") + } if err != nil { return nil, err } @@ -60,54 +65,146 @@ func decodeFrame(decoder *fdkaac.AacDecoder, r io.Reader) ([]float32, error) { return decodeFrame(decoder, r) } +func tryDecodeFrame(decoder *fdkaac.AacDecoder) ([]float32, error) { + pcm, err := decoder.Decode() + + if err != nil { + return nil, err + } + + if pcm != nil { + return cgo.Int32ToFloat32(cgo.BytesToInt32(pcm, 16), 16), nil + } + + return nil, nil +} + +func decodeFrameMP4(decoder *fdkaac.AacDecoder, demuxer *mp4Decoder) (result []float32, err error) { + pcm, err := tryDecodeFrame(decoder) + if err != nil { + return nil, err + } + if pcm != nil { + return pcm, err + } + + samples := demuxer.Read() + if samples == nil { + return nil, io.EOF + } + var n int + for _, sample := range samples { + if n, err = decoder.Fill(sample); n != 0 { + return nil, errors.New("buffer under read") + } + if err != nil { + return nil, err + } + pcm, err = tryDecodeFrame(decoder) + if pcm != nil { + result = append(result, pcm...) 
+ } + } + + return result, nil +} + func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { decoder := fdkaac.NewAacDecoder() - err := decoder.InitAdts() - if err != nil { - return audio.Source{}, err - } - - buf, err := decodeFrame(decoder, r) - - if err != nil { - decoder.Close() - return audio.Source{}, err - } - - newChannel := make(chan []float32) - - go func() { - defer close(newChannel) - defer decoder.Close() - - if len(buf) > 0 { - newChannel <- buf + mp4Demuxer, err := tryDecodeMP4(r) + if err != nil { //try ADTS + r.Seek(0, io.SeekStart) + err = decoder.InitAdts() + if err != nil { + return audio.Source{}, err } - for { - buf, err = decodeFrame(decoder, r) + buf, err := decodeFrame(decoder, r) - if err != nil { - return - } + if err != nil { + decoder.Close() + return audio.Source{}, err + } + + newChannel := make(chan []float32) + + go func() { + defer close(newChannel) + defer decoder.Close() if len(buf) > 0 { newChannel <- buf } - } - }() - return audio.Source{ - Channels: decoder.NumChannels(), - SampleRate: decoder.SampleRate(), - Blocks: newChannel, - }, nil + for { + buf, err = decodeFrame(decoder, r) + + if err != nil { + return + } + + if len(buf) > 0 { + newChannel <- buf + } + } + }() + + return audio.Source{ + Channels: decoder.NumChannels(), + SampleRate: decoder.SampleRate(), + Blocks: newChannel, + }, nil + } else { + return audio.Source{}, fmt.Errorf("unsupported format mp4") + err = decoder.InitRaw(mp4Demuxer.cookie) + if err != nil { + return audio.Source{}, err + } + + buf, err := decodeFrameMP4(decoder, mp4Demuxer) + + if err != nil { + decoder.Close() + return audio.Source{}, err + } + + newChannel := make(chan []float32) + + go func() { + defer close(newChannel) + defer decoder.Close() + + if len(buf) > 0 { + newChannel <- buf + } + + for { + buf, err = decodeFrameMP4(decoder, mp4Demuxer) + + if err != nil { + return + } + + if len(buf) > 0 { + newChannel <- buf + } + } + }() + + return audio.Source{ + Channels: 
decoder.NumChannels(), + SampleRate: decoder.SampleRate(), + Blocks: newChannel, + }, nil + } + } func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[string]interface{}) error { var bitrate = 128 var isHEv2 bool + var format = "adts" if options != nil { var val interface{} @@ -146,16 +243,33 @@ func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[s } } } + if val, ok = options["format"]; ok { + if strVal, ok = val.(string); ok { + format = strVal + } + } + } + + muxingMode := fdkaac.MuxingModeADTS + + if format == "adts" { + muxingMode = fdkaac.MuxingModeADTS + } else if format == "mp4" { + muxingMode = fdkaac.MuxingModeRAW + } else if format == "adif" { + muxingMode = fdkaac.MuxingModeADIF + } else { + return fmt.Errorf("unsupported format %s", format) } encoder := fdkaac.NewAacEncoder() if isHEv2 { - err := encoder.InitHEv2(source.Channels, source.SampleRate, bitrate*1024, fdkaac.MuxingModeADTS) + err := encoder.InitHEv2(source.Channels, source.SampleRate, bitrate*1024, muxingMode) if err != nil { return err } } else { - err := encoder.InitLc(source.Channels, source.SampleRate, bitrate*1024, fdkaac.MuxingModeADTS) + err := encoder.InitLc(source.Channels, source.SampleRate, bitrate*1024, muxingMode) if err != nil { return err } @@ -164,14 +278,172 @@ func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[s frameSize := encoder.FrameSize() * encoder.Channels() - var buffer []int16 - for block := range source.Blocks { + if format == "mp4" { + init := mp4.CreateEmptyInit() + init.AddEmptyTrack(uint32(source.SampleRate), "audio", "en") + trackId := init.Moov.Mvhd.NextTrackID - 1 + trak := init.Moov.Trak - buffer = append(buffer, cgo.Float32ToInt16(block)...) 
+ objType := aac_adts.AAClc + if isHEv2 { + objType = aac_adts.HEAACv2 + } - for len(buffer) >= frameSize { - sl := buffer[:frameSize] - buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&sl[0])), len(sl)*2)) + { + stsd := trak.Mdia.Minf.Stbl.Stsd + asc := &aac_adts.AudioSpecificConfig{ + ObjectType: byte(objType), + ChannelConfiguration: byte(source.Channels), + SamplingFrequency: source.SampleRate, + ExtensionFrequency: 0, + SBRPresentFlag: false, + PSPresentFlag: false, + } + switch objType { + case aac_adts.HEAACv1: + asc.ExtensionFrequency = 2 * source.SampleRate + asc.SBRPresentFlag = true + case aac_adts.HEAACv2: + asc.ExtensionFrequency = 2 * source.SampleRate + asc.SBRPresentFlag = true + asc.ChannelConfiguration = 1 + asc.PSPresentFlag = true + } + + buf := &bytes.Buffer{} + err := asc.Encode(buf) + if err != nil { + return err + } + ascBytes := buf.Bytes() + esds := mp4.CreateEsdsBox(ascBytes) + mp4a := mp4.CreateAudioSampleEntryBox("mp4a", + uint16(asc.ChannelConfiguration), + 16, uint16(source.SampleRate), esds) + stsd.AddChild(mp4a) + } + + init.Encode(writer) + + var seqNumber uint32 + var packetsWritten uint64 + var outputBuffer [][]byte + + segmentDuration := time.Millisecond * 100 + + outputSegment := func() { + seg := mp4.NewMediaSegment() + frag, _ := mp4.CreateFragment(seqNumber, trackId) + seg.AddFragment(frag) + + for _, b := range outputBuffer { + frag.AddFullSampleToTrack(mp4.FullSample{ + Sample: mp4.Sample{ + Dur: uint32(frameSize), + Size: uint32(len(b)), + }, + DecodeTime: uint64(frameSize) * packetsWritten, + Data: b, + }, trackId) + packetsWritten++ + } + + seg.Encode(writer) + seqNumber++ + outputBuffer = nil + } + + outputPacket := func(packet []byte) { + outputBuffer = append(outputBuffer, packet) + + if time.Duration(float64(time.Second)*(float64(frameSize*len(outputBuffer))/float64(source.SampleRate))) >= segmentDuration { + outputSegment() + } + } + + var buffer []int16 + for block := range source.Blocks { + + 
buffer = append(buffer, cgo.Float32ToInt16(block)...) + + for len(buffer) >= frameSize { + sl := buffer[:frameSize] + buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&sl[0])), len(sl)*2)) + + if err != nil { + return err + } + + if len(buf) > 0 { + outputPacket(buf) + } + + buffer = buffer[frameSize:] + } + } + + if len(buffer) > 0 { + //pad + buffer = append(buffer, make([]int16, frameSize-len(buffer))...) + buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&buffer[0])), len(buffer)*2)) + + if err != nil { + return err + } + + if len(buf) > 0 { + outputPacket(buf) + } + } + + //Do flush + for { + buf, err := encoder.Flush() + + if err != nil { + return err + } + + if len(buf) > 0 { + outputPacket(buf) + } else { + break + } + } + + if len(outputBuffer) > 0 { + outputSegment() + } + } else { + + var buffer []int16 + for block := range source.Blocks { + + buffer = append(buffer, cgo.Float32ToInt16(block)...) + + for len(buffer) >= frameSize { + sl := buffer[:frameSize] + buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&sl[0])), len(sl)*2)) + + if err != nil { + return err + } + + if len(buf) > 0 { + _, err = writer.Write(buf) + if err != nil { + return err + } + } + + buffer = buffer[frameSize:] + } + } + + if len(buffer) > 0 { + //pad + buffer = append(buffer, make([]int16, frameSize-len(buffer))...) + buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&buffer[0])), len(buffer)*2)) if err != nil { return err @@ -183,43 +455,24 @@ func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[s return err } } - - buffer = buffer[frameSize:] - } - } - - if len(buffer) > 0 { - //pad - buffer = append(buffer, make([]int16, frameSize-len(buffer))...) 
- buf, err := encoder.Encode(unsafe.Slice((*byte)(unsafe.Pointer(&buffer[0])), len(buffer)*2)) - - if err != nil { - return err } - if len(buf) > 0 { - _, err = writer.Write(buf) + //Do flush + for { + buf, err := encoder.Flush() + if err != nil { return err } - } - } - //Do flush - for { - buf, err := encoder.Flush() - - if err != nil { - return err - } - - if len(buf) > 0 { - _, err = writer.Write(buf) - if err != nil { - return err + if len(buf) > 0 { + _, err = writer.Write(buf) + if err != nil { + return err + } + } else { + break } - } else { - break } } diff --git a/audio/format/aac/libfdk-aac_test.go b/audio/format/aac/libfdk-aac_test.go index a6e3919..a0c2214 100644 --- a/audio/format/aac/libfdk-aac_test.go +++ b/audio/format/aac/libfdk-aac_test.go @@ -96,6 +96,77 @@ func TestDecodeAAC(t *testing.T) { } } +/* +func TestDecodeAACMP4(t *testing.T) { + t.Parallel() + fp, err := os.Open(test.TestSingleSample24) + if err != nil { + t.Error(err) + return + } + defer fp.Close() + source, err := flac.NewFormat().Open(fp) + if err != nil { + t.Error(err) + return + } + + target, err := os.Open(test.TestSingleSampleAACMP4) + if err != nil { + t.Error(err) + return + } + defer target.Close() + + source, err = NewFormat().Open(target) + if err != nil { + t.Error(err) + return + } + + for range source.Blocks { + + } +} +*/ + +func TestEncodeAACMP4(t *testing.T) { + t.Parallel() + fp, err := os.Open(test.TestSingleSample24) + if err != nil { + t.Error(err) + return + } + defer fp.Close() + source, err := flac.NewFormat().Open(fp) + if err != nil { + t.Error(err) + return + } + + target, err := os.CreateTemp("/tmp", "encode_test_*.m4a") + if err != nil { + t.Error(err) + return + } + + defer func() { + name := target.Name() + target.Close() + os.Remove(name) + }() + + options := make(map[string]interface{}) + options["bitrate"] = "256k" + options["format"] = "mp4" + + err = NewFormat().Encode(source, target, options) + if err != nil { + t.Error(err) + return + } +} + 
func TestEncodeAACHE(t *testing.T) { t.Parallel() fp, err := os.Open(test.TestSingleSample24) diff --git a/audio/format/aac/mp4.go b/audio/format/aac/mp4.go new file mode 100644 index 0000000..2f11e64 --- /dev/null +++ b/audio/format/aac/mp4.go @@ -0,0 +1,107 @@ +//go:build !disable_format_mp4 +// +build !disable_format_mp4 + +package aac + +import ( + "bytes" + "github.com/edgeware/mp4ff/mp4" + "io" +) + +import ( + "errors" +) + +type mp4Decoder struct { + mp4 *mp4.File + reader io.ReadSeekCloser + trackId uint32 + cookie []byte + + currentSegment int + currentFragment int + currentSample uint32 +} + +func tryDecodeMP4(r io.ReadSeekCloser) (*mp4Decoder, error) { + //TODO: mp4.DecModeLazyMdat errors in segmented files + parsedMp4, err := mp4.DecodeFile(r, mp4.WithDecodeMode(mp4.DecModeNormal)) + if err != nil { + return nil, err + } + + var trackId uint32 + var magicCookie []byte + + for _, trak := range parsedMp4.Moov.Traks { + if box, err := trak.Mdia.Minf.Stbl.Stsd.GetSampleDescription(0); err == nil && box.Type() == "mp4a" { + if aseb, ok := box.(*mp4.AudioSampleEntryBox); ok { + trackId = trak.Tkhd.TrackID + magicCookie = aseb.Esds.DecConfig + break + } + + } + } + + if magicCookie == nil { + return nil, errors.New("could not find track entry") + } + + return &mp4Decoder{ + reader: r, + mp4: parsedMp4, + cookie: magicCookie, + trackId: trackId, + currentSample: 1, + }, nil +} + +func (d *mp4Decoder) Read() (samples [][]byte) { + if d.mp4.IsFragmented() { + if d.currentSegment >= len(d.mp4.Segments) { + //EOF + return nil + } + segment := d.mp4.Segments[d.currentSegment] + + if d.currentFragment >= len(segment.Fragments) { + d.currentSegment++ + d.currentFragment = 0 + return d.Read() + } + + frag := segment.Fragments[d.currentFragment] + + fullSamples, err := frag.GetFullSamples(&mp4.TrexBox{ + TrackID: d.trackId, + }) + if err != nil { + return nil + } + + for _, sample := range fullSamples { + samples = append(samples, sample.Data) + } + d.currentFragment++ 
+ + return + } else { + if d.mp4.Mdat.Data == nil { + return nil + } + for _, trak := range d.mp4.Moov.Traks { + if trak.Tkhd.TrackID == d.trackId { + buf := new(bytes.Buffer) + if err := d.mp4.CopySampleData(buf, d.reader, trak, d.currentSample, d.currentSample); err != nil || buf.Len() == 0 { + return nil + } + d.currentSample++ + + return [][]byte{buf.Bytes()} + } + } + return + } +} diff --git a/audio/format/aac/vo-aacenc.go b/audio/format/aac/vo-aacenc.go index 2b6bf38..7d96865 100644 --- a/audio/format/aac/vo-aacenc.go +++ b/audio/format/aac/vo-aacenc.go @@ -23,7 +23,7 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "vo-aacenc (aac-go)" + return "vo-aacenc (gen2brain/aac-go)" } func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[string]interface{}) error { @@ -67,6 +67,13 @@ func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[s } } } + if val, ok = options["format"]; ok { + if strVal, ok = val.(string); ok { + if strVal != "adts" { + return fmt.Errorf("format %s not supported", strVal) + } + } + } } encoder, err := aac.NewEncoder(writer, &aac.Options{ diff --git a/audio/format/alac/libalac.go b/audio/format/alac/libalac.go index a115607..b18aa4c 100644 --- a/audio/format/alac/libalac.go +++ b/audio/format/alac/libalac.go @@ -26,11 +26,16 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "libalac (go-alac)" + return "libalac (S.O.N.G/go-alac)" } func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { - decoder := go_alac.NewFormatDecoder(r) + mp4Demuxer, err := tryDecodeMP4(r) + if err != nil { + return audio.Source{}, err + } + + decoder := go_alac.NewFrameDecoder(mp4Demuxer.cookie) if decoder == nil { return audio.Source{}, errors.New("could not decode") } @@ -41,11 +46,17 @@ func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { defer close(newChannel) for { - pcm := decoder.Read() - if pcm == nil { + samples := 
mp4Demuxer.Read() + if samples == nil { return } - newChannel <- cgo.Int32ToFloat32(cgo.BytesToInt32(pcm, decoder.GetBitDepth()), decoder.GetBitDepth()) + for _, sample := range samples { + _, pcm := decoder.ReadPacket(sample) + if pcm == nil { + return + } + newChannel <- cgo.Int32ToFloat32(cgo.BytesToInt32(pcm, decoder.GetBitDepth()), decoder.GetBitDepth()) + } } }() @@ -57,7 +68,12 @@ func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { } func (f Format) OpenAnalyzer(r io.ReadSeekCloser) (audio.Source, format.AnalyzerChannel, error) { - decoder := go_alac.NewFormatDecoder(r) + mp4Demuxer, err := tryDecodeMP4(r) + if err != nil { + return audio.Source{}, nil, err + } + + decoder := go_alac.NewFrameDecoder(mp4Demuxer.cookie) if decoder == nil { return audio.Source{}, nil, errors.New("could not decode") } @@ -70,19 +86,25 @@ func (f Format) OpenAnalyzer(r io.ReadSeekCloser) (audio.Source, format.Analyzer defer close(analyzerChannel) for { - pcm := decoder.Read() - if pcm == nil { + samples := mp4Demuxer.Read() + if samples == nil { return } + for _, sample := range samples { + _, pcm := decoder.ReadPacket(sample) + if pcm == nil { + return + } - intSamples := cgo.BytesToInt32(pcm, decoder.GetBitDepth()) - newChannel <- cgo.Int32ToFloat32(intSamples, decoder.GetBitDepth()) + intSamples := cgo.BytesToInt32(pcm, decoder.GetBitDepth()) + newChannel <- cgo.Int32ToFloat32(intSamples, decoder.GetBitDepth()) - analyzerChannel <- &format.AnalyzerPacket{ - Samples: intSamples, - Channels: decoder.GetChannels(), - BitDepth: decoder.GetBitDepth(), - SampleRate: decoder.GetSampleRate(), + analyzerChannel <- &format.AnalyzerPacket{ + Samples: intSamples, + Channels: decoder.GetChannels(), + BitDepth: decoder.GetBitDepth(), + SampleRate: decoder.GetSampleRate(), + } } } }() diff --git a/audio/format/alac/libalac_test.go b/audio/format/alac/libalac_test.go index f442bd3..2f373f8 100644 --- a/audio/format/alac/libalac_test.go +++ b/audio/format/alac/libalac_test.go @@ 
-45,3 +45,33 @@ func TestEncodeALAC(t *testing.T) { return } } + +func TestDecodeALAC(t *testing.T) { + t.Parallel() + fp, err := os.Open(test.TestSingleSample24) + if err != nil { + t.Error(err) + return + } + defer fp.Close() + source, err := flac.NewFormat().Open(fp) + if err != nil { + t.Error(err) + return + } + + target, err := os.Open(test.TestSingleSampleALACMP4) + if err != nil { + t.Error(err) + return + } + source, err = NewFormat().Open(target) + if err != nil { + t.Error(err) + return + } + + for range source.Blocks { + + } +} diff --git a/audio/format/alac/mp4.go b/audio/format/alac/mp4.go new file mode 100644 index 0000000..71bebbf --- /dev/null +++ b/audio/format/alac/mp4.go @@ -0,0 +1,107 @@ +//go:build !disable_format_alac +// +build !disable_format_alac + +package alac + +import ( + "bytes" + "errors" + "github.com/edgeware/mp4ff/mp4" + "io" +) + +type mp4Decoder struct { + mp4 *mp4.File + reader io.ReadSeekCloser + trackId uint32 + cookie []byte + + currentSegment int + currentFragment int + currentSample uint32 +} + +func tryDecodeMP4(r io.ReadSeekCloser) (*mp4Decoder, error) { + //TODO: mp4.DecModeLazyMdat errors in segmented files + parsedMp4, err := mp4.DecodeFile(r, mp4.WithDecodeMode(mp4.DecModeNormal)) + if err != nil { + return nil, err + } + + var trackId uint32 + var magicCookie []byte + + for _, trak := range parsedMp4.Moov.Traks { + if box, err := trak.Mdia.Minf.Stbl.Stsd.GetSampleDescription(0); err == nil && box.Type() == "alac" { + trackId = trak.Tkhd.TrackID + + buf := new(bytes.Buffer) + box.Encode(buf) + + boxBytes := buf.Bytes() + + boxOffset := 36 + 12 + + magicCookie = boxBytes[boxOffset:] + break + + } + } + + if magicCookie == nil { + return nil, errors.New("could not find track entry") + } + + return &mp4Decoder{ + reader: r, + mp4: parsedMp4, + cookie: magicCookie, + trackId: trackId, + currentSample: 1, + }, nil +} + +func (d *mp4Decoder) Read() (samples [][]byte) { + if d.mp4.IsFragmented() { + if d.currentSegment >= 
len(d.mp4.Segments) { + //EOF + return nil + } + segment := d.mp4.Segments[d.currentSegment] + + if d.currentFragment >= len(segment.Fragments) { + d.currentSegment++ + d.currentFragment = 0 + return d.Read() + } + + frag := segment.Fragments[d.currentFragment] + + fullSamples, err := frag.GetFullSamples(&mp4.TrexBox{ + TrackID: d.trackId, + }) + if err != nil { + return nil + } + + for _, sample := range fullSamples { + samples = append(samples, sample.Data) + } + d.currentFragment++ + + return + } else { + for _, trak := range d.mp4.Moov.Traks { + if trak.Tkhd.TrackID == d.trackId { + buf := new(bytes.Buffer) + if err := d.mp4.CopySampleData(buf, d.reader, trak, d.currentSample, d.currentSample); err != nil { + return nil + } + d.currentSample++ + + return [][]byte{buf.Bytes()} + } + } + return + } +} diff --git a/audio/format/flac/flac.go b/audio/format/flac/flac.go index 1d2a9c4..7b47ff4 100644 --- a/audio/format/flac/flac.go +++ b/audio/format/flac/flac.go @@ -24,7 +24,7 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "libFLAC (goflac)" + return "libFLAC (S.O.N.G/goflac)" } func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { diff --git a/audio/format/mp3/mp3_lame.go b/audio/format/mp3/mp3_lame.go index 98be25f..b0d645d 100644 --- a/audio/format/mp3/mp3_lame.go +++ b/audio/format/mp3/mp3_lame.go @@ -14,7 +14,7 @@ import ( ) func (f Format) Description() string { - return "minimp3 / LAME (go-lame)" + return "kvark128/minimp3, LAME (viert/go-lame)" } func (f Format) Encode(source audio.Source, writer io.WriteCloser, options map[string]interface{}) error { diff --git a/audio/format/mp3/mp3_nolame.go b/audio/format/mp3/mp3_nolame.go index ce59807..eba34b4 100644 --- a/audio/format/mp3/mp3_nolame.go +++ b/audio/format/mp3/mp3_nolame.go @@ -4,5 +4,5 @@ package mp3 func (f Format) Description() string { - return "minimp3 / LAME" + return "kvark128/minimp3" } diff --git a/audio/format/opus/opus.go 
b/audio/format/opus/opus.go index bec72fb..5676cb1 100644 --- a/audio/format/opus/opus.go +++ b/audio/format/opus/opus.go @@ -27,7 +27,7 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "libopus (go-pus)" + return "libopus (S.O.N.G/go-pus)" } func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { diff --git a/audio/format/tta/tta.go b/audio/format/tta/tta.go index a20a621..7020a99 100644 --- a/audio/format/tta/tta.go +++ b/audio/format/tta/tta.go @@ -25,7 +25,7 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "go-tta" + return "S.O.N.G/go-tta" } func NewFormat() Format { diff --git a/audio/format/vorbis/vorbis.go b/audio/format/vorbis/vorbis.go index 2b76430..089a4f1 100644 --- a/audio/format/vorbis/vorbis.go +++ b/audio/format/vorbis/vorbis.go @@ -22,7 +22,7 @@ func (f Format) Name() string { } func (f Format) Description() string { - return "oggvorvis" + return "jfreymuth/oggvorbis" } func (f Format) Open(r io.ReadSeekCloser) (audio.Source, error) { diff --git a/go.sum b/go.sum index b6ebb51..448bf1e 100644 --- a/go.sum +++ b/go.sum @@ -1,5 +1,3 @@ -git.gammaspectra.live/S.O.N.G/go-alac v0.0.0-20220421110341-7839cd4c1da1 h1:D1VyacBGUBfvFD4Fq2eO6RQ5eZPUdC2YsIv9gXun9aQ= -git.gammaspectra.live/S.O.N.G/go-alac v0.0.0-20220421110341-7839cd4c1da1/go.mod h1:f1+h7KOnuM9zcEQp7ri4UaVvgX4m1NFFIXgReIyjGMA= git.gammaspectra.live/S.O.N.G/go-alac v0.0.0-20220421115623-d0b3bfe57e0f h1:CxN7zlk5FdAieyRKQSbwBGBsvQ2cDF8JVCODZpzcRkA= git.gammaspectra.live/S.O.N.G/go-alac v0.0.0-20220421115623-d0b3bfe57e0f/go.mod h1:f1+h7KOnuM9zcEQp7ri4UaVvgX4m1NFFIXgReIyjGMA= git.gammaspectra.live/S.O.N.G/go-ebur128 v0.0.0-20220418202343-73a167e76255 h1:BWRx2ZFyhp5+rsXhdDZtk5Gld+L44lxlN9ASqB9Oj0M= diff --git a/resources/aac.m4a b/resources/aac.m4a new file mode 100644 index 0000000..b096cc1 --- /dev/null +++ b/resources/aac.m4a @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid
sha256:93cfcf7d6407e4330d454e525f69e3879debfdc671190d98fd6cbb56c56f41d3 +size 6381080 diff --git a/resources/alac.m4a b/resources/alac.m4a new file mode 100644 index 0000000..ea6e20a --- /dev/null +++ b/resources/alac.m4a @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a05f48eaaa2cddcfdfa72b32a852db6459e2b13eaac077b2e080fec34edcc0bc +size 86563510 diff --git a/test/constants.go b/test/constants.go index 43dfe24..f692a86 100644 --- a/test/constants.go +++ b/test/constants.go @@ -25,3 +25,5 @@ const TestSingleSample24 = "resources/samples/cYsmix - Haunted House/11. The Gre const TestSingleSample16 = "resources/samples/Babbe Music - RADIANT DANCEFLOOR/01. ENTER.flac" const TestSingleSample16TTA = "resources/samples/Babbe Music - RADIANT DANCEFLOOR/01. ENTER.tta" +const TestSingleSampleAACMP4 = "resources/aac.m4a" +const TestSingleSampleALACMP4 = "resources/alac.m4a"