Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video

doi:10.57702/pyhelu8l

License

No License Provided

Description

The proposed Deep Visual Forced Alignment (DVFA) time-aligns an input transcription with an input talking face video without using speech audio.
