PyAV issue #1044
Issue created Oct 24, 2022 by Administrator (@root). 6 of 6 checklist items completed.

[av v10.0] The output container puts the audio stream under streams.other instead of streams.audio

Created by: YosuaMichael

Overview

In av==10.0, when we add an audio stream to an output container with container.add_stream('aac', rate=44100), container.streams.audio remains an empty tuple and the audio stream ends up in container.streams.other instead (see the reproduction code below).

This bug does not occur in av==9.2.

Expected behavior

When we call container.add_stream('aac', rate=44100), we expect container.streams.audio to be non-empty.

Actual behavior

When we call container.add_stream('aac', rate=44100), container.streams.audio is empty and container.streams.other contains the audio stream instead.

Reproduction

# Simplified from torchvision code: https://github.com/pytorch/vision/blob/main/torchvision/io/video.py#L99
import av
import torch
import numpy as np

audio_fps = 44100
audio_codec = 'aac'
audio_layout = "stereo"
audio_array = torch.rand((2, 44100))

container = av.open("test_write.mp4", mode="w")
a_stream = container.add_stream(audio_codec, rate=audio_fps)
audio_format_dtypes = {
    "dbl": "<f8",
    "dblp": "<f8",
    "flt": "<f4",
    "fltp": "<f4",
    "s16": "<i2",
    "s16p": "<i2",
    "s32": "<i4",
    "s32p": "<i4",
    "u8": "u1",
    "u8p": "u1",
}
audio_sample_fmt = container.streams[0].format.name
format_dtype = np.dtype(audio_format_dtypes[audio_sample_fmt])
audio_array = torch.as_tensor(audio_array).numpy().astype(format_dtype)
frame = av.AudioFrame.from_ndarray(audio_array, format=audio_sample_fmt, layout=audio_layout)

frame.sample_rate = audio_fps

for packet in a_stream.encode(frame):
    container.mux(packet)

for packet in a_stream.encode():
    container.mux(packet)
    
container.close()

print(f"container.streams.audio: {container.streams.audio}")
print(f"container.streams.other: {container.streams.other}")

# Using av==10.0
# container.streams.audio: ()
# container.streams.other: (<av.Stream #0 audio/aac at 0x12551fe20>,)


# Using av==9.2
# container.streams.audio: (<av.AudioStream #0 aac at 44100Hz, stereo, fltp at 0x1275920e0>,)
# container.streams.other: ()
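
Until the classification is fixed, one possible workaround is to select streams by their reported media type rather than relying on container.streams.audio. The sketch below is not part of PyAV: streams_of_type is a hypothetical helper, and it assumes that each stream still exposes a type attribute equal to "audio" in av==10.0, as the "audio/aac" in the repr above suggests. Stand-in objects are used here so the snippet runs without PyAV installed.

```python
from types import SimpleNamespace


def streams_of_type(streams, media_type):
    """Return a tuple of the streams whose `type` attribute equals media_type.

    Hypothetical helper: it filters on `stream.type` instead of trusting the
    per-type accessors such as `container.streams.audio`.
    """
    return tuple(s for s in streams if getattr(s, "type", None) == media_type)


# Stand-ins for stream objects; with PyAV one would pass `container.streams`
# (assumption: iterating it yields every stream, whatever its Python class).
fake_streams = [
    SimpleNamespace(index=0, type="audio"),
    SimpleNamespace(index=1, type="video"),
]

audio_streams = streams_of_type(fake_streams, "audio")
print(len(audio_streams))
```

With a real container the call would look like streams_of_type(container.streams, "audio"). Whether the generic stream object returned under av==10.0 supports everything an audio stream does is not confirmed here.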

Versions

  • OS: macOS 12.6
  • PyAV runtime:
PyAV v10.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --arch=arm64 --enable-cross-compile --disable-alsa --disable-doc --disable-mediafoundation --enable-fontconfig --enable-gmp --disable-gnutls --enable-gpl --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --disable-libxcb --enable-libxml2 --enable-libxvid --enable-lzma --enable-version3 --enable-zlib
library license: GPL version 3 or later
libavcodec     59. 37.100
libavdevice    59.  7.100
libavfilter     8. 44.100
libavformat    59. 27.100
libavutil      57. 28.100
libswresample   4.  7.100
libswscale      6.  7.100
  • PyAV build:
Installed with `pip install av==10.0`
  • FFmpeg:
ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
built with clang version 12.0.0
configuration: --prefix=/Users/yosuamichael/opt/miniconda3/envs/tv --cc=arm64-apple-darwin20.0.0-clang --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
libavutil      56. 31.100 / 56. 31.100
libavcodec     58. 54.100 / 58. 54.100
libavformat    58. 29.100 / 58. 29.100
libavdevice    58.  8.100 / 58.  8.100
libavfilter     7. 57.100 /  7. 57.100
libavresample   4.  0.  0 /  4.  0.  0
libswscale      5.  5.100 /  5.  5.100
libswresample   3.  5.100 /  3.  5.100
libpostproc    55.  5.100 / 55.  5.100

Research

I have done the following:

  • Checked the PyAV documentation
  • Searched on Google
  • Searched on Stack Overflow
  • Looked through old GitHub issues
  • Asked on PyAV Gitter
  • ... and waited 72 hours for a response.

Additional context

This behaviour breaks torchvision when it tries to write a video with av==10.0. Here is the related issue: https://github.com/pytorch/vision/issues/6814.
