Simpleoier Librispeech Asr Train Asr Conformer7 Wavlm Large Raw En Bpe5000 Sp
S
Simpleoier Librispeech Asr Train Asr Conformer7 Wavlm Large Raw En Bpe5000 Sp
Developed by espnet
An automatic speech recognition (ASR) model trained on the ESPnet framework, using the Conformer architecture and the WavLM large pre-trained model, trained on the LibriSpeech dataset.
Downloads 66
Release Time : 3/2/2022
Model Overview
This model is a high-performance English automatic speech recognition system designed to process raw audio input and convert it into text.
Model Features
High-performance architecture
Combines Conformer7 and the WavLM large pre-trained model to deliver exceptional speech recognition capabilities
LibriSpeech training
Trained on the widely-used LibriSpeech dataset, ensuring robustness in various speech conditions
Low error rate
Outstanding performance on test sets, with a word error rate (WER) as low as 1.8% on clean speech and 3.7% on noisy speech
Model Capabilities
English speech recognition
Raw audio processing
Large-scale speech-to-text conversion
Use Cases
Speech transcription
Meeting minutes
Automatically transcribe meeting recordings
Accuracy up to 98.4% (test set clean data)
Audio caption generation
Generate subtitles for podcasts or video content
Maintains 96.7% accuracy even in noisy speech environments
Voice assistants
Voice command recognition
Recognize and execute voice commands
🚀 ESPnet2 ASR model
This ESPnet2 ASR model, specifically espnet/simpleoier_librispeech_asr_train_asr_conformer7_wavlm_large_raw_en_bpe5000_sp
, is designed for automatic speech recognition tasks. It offers high - performance recognition capabilities and is trained on the Librispeech dataset.
📦 Installation
This model was trained by simpleoier using the librispeech recipe in espnet. Here's how you can use it in ESPnet2:
cd espnet
git checkout b0ff60946ada6753af79423a2e6063984bec2926
pip install -e .
cd egs2/librispeech/asr1
./run.sh --skip_data_prep false --skip_train true --download_model espnet/simpleoier_librispeech_asr_train_asr_conformer7_wavlm_large_raw_en_bpe5000_sp
📊 RESULTS
Environments
- date:
Tue Jan 4 20:52:48 EST 2022
- python version:
3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
- espnet version:
espnet 0.10.5a1
- pytorch version:
pytorch 1.8.1
- Git hash: ``
- Commit date: ``
asr_train_asr_conformer7_wavlm_large_raw_en_bpe5000_sp
WER
dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_clean | 2703 | 54402 | 98.4 | 1.4 | 0.1 | 0.2 | 1.7 | 23.1 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_other | 2864 | 50948 | 96.7 | 3.0 | 0.3 | 0.3 | 3.6 | 35.5 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_clean | 2620 | 52576 | 98.4 | 1.5 | 0.1 | 0.2 | 1.8 | 23.7 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_other | 2939 | 52343 | 96.7 | 3.0 | 0.3 | 0.4 | 3.7 | 37.9 |
CER
dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_clean | 2703 | 288456 | 99.7 | 0.2 | 0.2 | 0.2 | 0.5 | 23.1 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_other | 2864 | 265951 | 98.9 | 0.6 | 0.4 | 0.4 | 1.5 | 35.5 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_clean | 2620 | 281530 | 99.6 | 0.2 | 0.2 | 0.2 | 0.6 | 23.7 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_other | 2939 | 272758 | 99.1 | 0.5 | 0.4 | 0.4 | 1.3 | 37.9 |
TER
dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_clean | 2703 | 68010 | 98.2 | 1.4 | 0.4 | 0.3 | 2.1 | 23.1 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/dev_other | 2864 | 63110 | 96.0 | 3.1 | 0.9 | 0.9 | 4.9 | 35.5 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_clean | 2620 | 65818 | 98.1 | 1.4 | 0.5 | 0.4 | 2.3 | 23.7 |
decode_asr_lm_lm_train_lm_transformer2_en_bpe5000_valid.loss.ave_asr_model_valid.acc.ave/test_other | 2939 | 65101 | 96.1 | 2.9 | 1.0 | 0.8 | 4.7 | 37.9 |
📄 ASR config
expand
config: conf/tuning/train_asr_conformer7_wavlm_large.yaml
print_config: false
log_level: INFO
dry_run: false
iterator_type: sequence
output_dir: exp/asr_train_asr_conformer7_wavlm_large_raw_en_bpe5000_sp
ngpu: 1
seed: 0
num_workers: 1
num_att_plot: 3
num_targets: 1
dist_backend: nccl
dist_init_method: env://
dist_world_size: 2
dist_rank: 0
local_rank: 0
dist_master_addr: localhost
dist_master_port: 45342
dist_launcher: null
multiprocessing_distributed: true
unused_parameters: false
sharded_ddp: false
cudnn_enabled: true
cudnn_benchmark: false
cudnn_deterministic: true
collect_stats: false
write_collected_feats: false
max_epoch: 35
patience: null
val_scheduler_criterion:
- valid
- loss
early_stopping_criterion:
- valid
- loss
- min
best_model_criterion:
- - valid
- acc
- max
keep_nbest_models: 10
nbest_averaging_interval: 0
grad_clip: 5.0
grad_clip_type: 2.0
grad_noise: false
accum_grad: 3
no_forward_run: false
resume: true
train_dtype: float32
use_amp: false
log_interval: null
use_tensorboard: true
use_wandb: false
wandb_project: null
wandb_id: null
wandb_entity: null
wandb_name: null
wandb_model_log_interval: -1
detect_anomaly: false
pretrain_path: null
init_param: []
ignore_init_mismatch: false
freeze_param:
- frontend.upstream
num_iters_per_epoch: null
batch_size: 20
valid_batch_size: null
batch_bins: 40000000
valid_batch_bins: null
train_shape_file:
- exp/asr_stats_raw_en_bpe5000_sp/train/speech_shape
- exp/asr_stats_raw_en_bpe5000_sp/train/text_shape.bpe
valid_shape_file:
- exp/asr_stats_raw_en_bpe5000_sp/valid/speech_shape
- exp/asr_stats_raw_en_bpe5000_sp/valid/text_shape.bpe
batch_type: numel
valid_batch_type: null
fold_length:
- 80000
- 150
sort_in_batch: descending
sort_batch: descending
multiple_iterator: false
chunk_length: 500
chunk_shift_ratio: 0.5
num_cache_chunks: 1024
train_data_path_and_name_and_type:
- - dump/raw/train_960_sp/wav.scp
- speech
- kaldi_ark
- - dump/raw/train_960_sp/text
- text
- text
valid_data_path_and_name_and_type:
- - dump/raw/dev/wav.scp
- speech
- kaldi_ark
- - dump/raw/dev/text
- text
- text
allow_variable_data_keys: false
max_cache_size: 0.0
max_cache_fd: 32
valid_max_cache_size: null
optim: adam
optim_conf:
lr: 0.0025
scheduler: warmuplr
scheduler_conf:
warmup_steps: 40000
token_list:
- <blank>
- <unk>
- ▁THE
- S
- ▁AND
- ▁OF
- ▁TO
- ▁A
- ▁IN
- ▁I
- ▁HE
- ▁THAT
- ▁WAS
- ED
- ▁IT
- ''''
- ▁HIS
- ING
- ▁YOU
- ▁WITH
- ▁FOR
- ▁HAD
- T
- ▁AS
- ▁HER
- ▁IS
- ▁BE
- ▁BUT
- ▁NOT
- ▁SHE
- D
- ▁AT
- ▁ON
- LY
- ▁HIM
- ▁THEY
- ▁ALL
- ▁HAVE
- ▁BY
- ▁SO
- ▁THIS
- ▁MY
- ▁WHICH
- ▁ME
- ▁SAID
- ▁FROM
- ▁ONE
- Y
- E
- ▁WERE
- ▁WE
- ▁NO
- N
- ▁THERE
- ▁OR
- ER
- ▁AN
- ▁WHEN
- ▁ARE
- ▁THEIR
- ▁WOULD
- ▁IF
- ▁WHAT
- ▁THEM
- ▁WHO
- ▁OUT
- M
- ▁DO
- ▁WILL
- ▁UP
- ▁BEEN
- P
- R
- ▁MAN
- ▁THEN
- ▁COULD
- ▁MORE
- C
- ▁INTO
- ▁NOW
- ▁VERY
- ▁YOUR
- ▁SOME
- ▁LITTLE
- ES
- ▁TIME
- RE
- ▁CAN
- ▁LIKE
- LL
- ▁ABOUT
- ▁HAS
- ▁THAN
- ▁DID
- ▁UPON
- ▁OVER
- IN
- ▁ANY
- ▁WELL
- ▁ONLY
- B
- ▁SEE
- ▁GOOD
- ▁OTHER
- ▁TWO
- L
- ▁KNOW
- ▁GO
- ▁DOWN
- ▁BEFORE
- A
- AL
- ▁OUR
- ▁OLD
- ▁SHOULD
- ▁MADE
- ▁AFTER
- ▁GREAT
- ▁DAY
- ▁MUST
- ▁COME
- ▁HOW
- ▁SUCH
- ▁CAME
- LE
- ▁WHERE
- ▁US
- ▁NEVER
- ▁THESE
- ▁MUCH
- ▁DE
- ▁MISTER
- ▁WAY
- G
- ▁S
- ▁MAY
- ATION
- ▁LONG
- OR
- ▁AM
- ▁FIRST
- ▁BACK
- ▁OWN
- ▁RE
- ▁AGAIN
- ▁SAY
- ▁MEN
- ▁WENT
- ▁HIMSELF
- ▁HERE
- NESS
- ▁THINK
- V
- IC
- ▁EVEN
- ▁THOUGHT
- ▁HAND
- ▁JUST
- ▁O
- ▁UN
- VE
- ION
- ▁ITS
- 'ON'
- ▁MAKE
- ▁MIGHT
- ▁TOO
- K
- ▁AWAY
- ▁LIFE
- TH
- ▁WITHOUT
- ST
- ▁THROUGH
- ▁MOST
- ▁TAKE
- ▁DON
- ▁EVERY
- F
- O
- ▁SHALL
- ▁THOSE
- ▁EYES
- AR
- ▁STILL
- ▁LAST
- ▁HOUSE
- ▁HEAD
- ABLE
- ▁NOTHING
- ▁NIGHT
- ITY
- ▁LET
- ▁MANY
- ▁OFF
- ▁BEING
- ▁FOUND
- ▁WHILE
- EN
- ▁SAW
- ▁GET
- ▁PEOPLE
- ▁FACE
- ▁YOUNG
- CH
- ▁UNDER
- ▁ONCE
- ▁TELL
- AN
- ▁THREE
- ▁PLACE
- ▁ROOM
- ▁YET
- ▁SAME
- IL
- US
- U
- ▁FATHER
- ▁RIGHT
- EL
- ▁THOUGH
- ▁ANOTHER
- LI
- RI
- ▁HEART
- IT
- ▁PUT
- ▁TOOK
- ▁GIVE
- ▁EVER
- ▁E
- ▁PART
- ▁WORK
- ERS
- ▁LOOK
- ▁NEW
- ▁KING
- ▁MISSUS
- ▁SIR
- ▁LOVE
- ▁MIND
- ▁LOOKED
- W
- RY
- ▁ASKED
- ▁LEFT
- ET
- ▁LIGHT
- CK
- ▁DOOR
- ▁MOMENT
- RO
- ▁WORLD
- ▁THINGS
- ▁HOME
- UL
- ▁THING
- LA
- ▁WHY
- ▁MOTHER
- ▁ALWAYS
- ▁FAR
- FUL
- ▁WATER
- CE
- IVE
- UR
- ▁HEARD
- ▁SOMETHING
- ▁SEEMED
- I
- LO
- ▁BECAUSE
- OL
- ▁END
- ▁TOLD
- ▁CON
- ▁YES
- ▁GOING
- ▁GOT
- RA
- IR
- ▁WOMAN
- ▁GOD
- EST
- TED
- ▁FIND
- ▁KNEW
- ▁SOON
- ▁EACH
- ▁SIDE
- H
- TON
- MENT
- ▁OH
- NE
- Z
- LING
- ▁AGAINST
- TER
- ▁NAME
- ▁MISS
- ▁QUITE
- ▁WANT
- ▁YEARS
- ▁FEW
- ▁BETTER
- ENT
- ▁HALF
- ▁DONE
- ▁ALSO
- ▁BEGAN
- ▁HAVING
- ▁ENOUGH
- IS
- ▁LADY
- ▁WHOLE
- LESS
- ▁BOTH
- ▁SEEN
- ▁SET
- ▁WHITE
- ▁COURSE
- IES
- ▁VOICE
- ▁CALLED
- ▁D
- ▁EX
- ATE
- ▁TURNED
- ▁GAVE
- ▁C
- ▁POOR
- MAN
- UT
- NA
- ▁DEAR
- ISH
- ▁GIRL
- ▁MORNING
- ▁BETWEEN
- LED
- ▁NOR
- IA
- ▁AMONG
- MA
- ▁
- ▁SMALL
- ▁REST
- ▁WHOM
- ▁FELT
- ▁HANDS
- ▁MYSELF
- ▁HIGH
- ▁M
- ▁HOWEVER
- ▁HERSELF
- ▁P
- CO
- ▁STOOD
- ID
- ▁KIND
- ▁HUNDRED
- AS
- ▁ROUND
- ▁ALMOST
- TY
- ▁SINCE
- ▁G
- AM
- ▁LA
- SE
- ▁BOY
- ▁MA
- ▁PERHAPS
- ▁WORDS
- ATED
- ▁HO
- X
- ▁MO
- ▁SAT
- ▁REPLIED
- ▁FOUR
- ▁ANYTHING
- ▁TILL
- ▁UNTIL
- ▁BLACK
- TION
- ▁CRIED
- RU
- TE
- ▁FACT
- ▁HELP
- ▁NEXT
- ▁LOOKING
- ▁DOES
- ▁FRIEND
- ▁LAY
- ANCE
- ▁POWER
- ▁BROUGHT
- VER
- ▁FIRE
- ▁KEEP
- PO
- FF
- ▁COUNTRY
- ▁SEA
- ▁WORD
- ▁CAR
- ▁DAYS
- ▁TOGETHER
- ▁IMP
- ▁REASON
- KE
- ▁INDEED
- TING
- ▁MATTER
- ▁FULL
- ▁TEN
- TIC
- ▁LAND
- ▁RATHER
- ▁AIR
- ▁HOPE
- ▁DA
- ▁OPEN
- ▁FEET
- ▁EN
- ▁FIVE
- ▁POINT
- ▁CO
- OM
- ▁LARGE
- ▁B
- ▁CL
- ME
- ▁GONE
- ▁CHILD
- INE
- GG
- ▁BEST
- ▁DIS
- UM
- ▁HARD
- ▁LORD
- OUS
- ▁WIFE
- ▁SURE
- ▁FORM
- DE
- ▁DEATH
- ANT
- ▁NATURE
- ▁BA
- ▁CARE
- ▁BELIEVE
- PP
- ▁NEAR
- ▁RO
- ▁RED
- ▁WAR
- IE
- ▁SPEAK
- ▁FEAR
- ▁CASE
- ▁TAKEN
- ▁ALONG
- ▁CANNOT
- ▁HEAR
- ▁THEMSELVES
- CI
- ▁PRESENT
- AD
- ▁MASTER
- ▁SON
- ▁THUS
- ▁LI
- ▁LESS
- ▁SUN
- ▁TRUE
- IM
- IOUS
- ▁THOUSAND
- ▁MONEY
- ▁W
- ▁BEHIND
- ▁CHILDREN
- ▁DOCTOR
- AC
- ▁TWENTY
- ▁WISH
- ▁SOUND
- ▁WHOSE
- ▁LEAVE
- ▁ANSWERED
- ▁THOU
- ▁DUR
- ▁HA
- ▁CERTAIN
- ▁PO
- ▁PASSED
- GE
- TO
- ▁ARM
- ▁LO
- ▁STATE
- ▁ALONE
- TA
- ▁SHOW
- ▁NEED
- ▁LIVE
- ND
- ▁DEAD
- ENCE
- ▁STRONG
- ▁PRE
- ▁TI
- ▁GROUND
- SH
- TI
- ▁SHORT
- IAN
- UN
- ▁PRO
- ▁HORSE
- MI
- ▁PRINCE
- ARD
- ▁FELL
- ▁ORDER
- ▁CALL
- AT
- ▁GIVEN
- ▁DARK
- ▁THEREFORE
- ▁CLOSE
- ▁BODY
- ▁OTHERS
- ▁SENT
- ▁SECOND
- ▁OFTEN
- ▁CA
- ▁MANNER
- MO
- NI
- ▁BRING
- ▁QUESTION
- ▁HOUR
- ▁BO
- AGE
- ▁ST
- ▁TURN
- ▁TABLE
- ▁GENERAL
- ▁EARTH
- ▁BED
- ▁REALLY
- ▁SIX
- 'NO'
- IST
- ▁BECOME
- ▁USE
- ▁READ
- ▁SE
- ▁VI
- ▁COMING
- ▁EVERYTHING
- ▁EM
- ▁ABOVE
- ▁EVENING
- ▁BEAUTIFUL
- ▁FEEL
- ▁RAN
- ▁LEAST
- ▁LAW
- ▁ALREADY
- ▁MEAN
- ▁ROSE
- WARD
- ▁ITSELF
- ▁SOUL
- ▁SUDDENLY
- ▁AROUND
- RED
- ▁ANSWER
- ICAL
- ▁RA
- ▁WIND
- ▁FINE
- ▁WON
- ▁WHETHER
- ▁KNOWN
- BER
- NG
- ▁TA
- ▁CAPTAIN
- ▁EYE
- ▁PERSON
- ▁WOMEN
- ▁SORT
- ▁ASK
- ▁BROTHER
- ▁USED
- ▁HELD
- ▁BIG
- ▁RETURNED
- ▁STRANGE
- ▁BU
- ▁PER
- ▁FREE
- ▁EITHER
- ▁WITHIN
- ▁DOUBT
- ▁YEAR
- ▁CLEAR
- ▁SIGHT
- ▁GRA
- ▁LOST
- ▁KEPT
- ▁F
- PE
- ▁BAR
- ▁TOWN
- ▁SLEEP
- ARY
- ▁HAIR
- ▁FRIENDS
- ▁DREAM
- ▁FELLOW
- PER
- ▁DEEP
- QUE
- ▁BECAME
- ▁REAL
- ▁PAST
- ▁MAKING
- RING
- ▁COMP
- ▁ACT
- ▁BAD
- HO
- STER
- ▁YE
- ▁MEANS
- ▁RUN
- MEN
- ▁DAUGHTER
- ▁SENSE
- ▁CITY
- ▁SOMETIMES
- ▁TOWARDS
- ▁ROAD
- ▁SP
- ▁LU
- ▁READY
- ▁FOOT
- ▁COLD
- ▁SA
- ▁LETTER
- ▁ELSE
- ▁MAR
- ▁STA
- BE
- ▁TRUTH
- ▁LE
- BO
- ▁BUSINESS
- CHE
- ▁JOHN
- ▁SUBJECT
- ▁COURT
- ▁IDEA
- ILY
- ▁RIVER
- ATING
- ▁FAMILY
- HE
- ▁DIDN
- ▁GLAD
- ▁SEVERAL
- IAL
- ▁UNDERSTAND
- ▁SC
- ▁POSSIBLE
- ▁DIFFERENT
- ▁RETURN
- ▁ARMS
- ▁LOW
- ▁HOLD
- ▁TALK
- ▁RU
- ▁WINDOW
- ▁INTEREST
- ▁SISTER
- SON
- ▁SH
- ▁BLOOD
- ▁SAYS
- ▁CAP
- ▁DI
- ▁HUMAN
- ▁CAUSE
- NCE
- ▁THANK
- ▁LATE
- GO
- ▁CUT
- ▁ACROSS
- ▁STORY
- NT
- ▁COUNT
- ▁ABLE
- DY
- LEY
- ▁NUMBER
- ▁STAND
- ▁CHURCH
- ▁THY
- ▁SUPPOSE
- LES
- BLE
- OP
- ▁EFFECT
- BY
- ▁K
- ▁NA
- ▁SPOKE
- ▁MET
- ▁GREEN
- ▁HUSBAND
- ▁RESPECT
- ▁PA
- ▁FOLLOWED
- ▁REMEMBER
- ▁LONGER
- ▁AGE
- ▁TAKING
- ▁LINE
- ▁SEEM
- ▁HAPPY
- LAND
- EM
- ▁STAY
- ▁PLAY
- ▁COMMON
- ▁GA
- ▁BOOK
- ▁TIMES
- ▁OBJECT
- ▁SEVEN
- QUI
- DO
- UND
- ▁FL
- ▁PRETTY
- ▁FAIR
- WAY
- ▁WOOD
- ▁REACHED
- ▁APPEARED
- ▁SWEET
- ▁FALL
- BA
- ▁PASS
- ▁SIGN
- ▁TREE
- IONS
- ▁GARDEN
- ▁ILL
- ▁ART
- ▁REMAIN
- ▁OPENED
- ▁BRIGHT
- ▁STREET
- ▁TROUBLE
- ▁PAIN
- ▁CONTINUED
- ▁SCHOOL
- OUR
- ▁CARRIED
- ▁SAYING
- HA
- ▁CHANGE
- ▁FOLLOW
- ▁GOLD
- ▁SW
- ▁FEELING
- ▁COMMAND
- ▁BEAR
- ▁CERTAINLY
- ▁BLUE
- ▁NE
- CA
- ▁WILD
- ▁ACCOUNT
- ▁OUGHT
- UD
- ▁T
- ▁BREATH
- ▁WANTED
- ▁RI
- ▁HEAVEN
- ▁PURPOSE
- ▁CHARACTER
- ▁RICH
- ▁PE
- ▁DRESS
- OS
- FA
- ▁TH
- ▁ENGLISH
- ▁CHANCE
- ▁SHIP
- ▁VIEW
- ▁TOWARD
- AK
- ▁JOY
- ▁JA
- ▁HAR
- ▁NEITHER
- ▁FORCE
- ▁UNCLE
- DER
- ▁PLAN
- ▁PRINCESS
- DI
- ▁CHIEF
- ▁HAT
- ▁LIVED
- ▁AB
- ▁VISIT
- ▁MOR
- TEN
- ▁WALL
- UC
- ▁MINE
- ▁PLEASURE
- ▁SMILE
- ▁FRONT
- ▁HU
- ▁DEAL
- OW
- ▁FURTHER
- GED
- ▁TRIED
- DA
- VA
- ▁NONE
- ▁ENTERED
- ▁QUEEN
- ▁PAY
- ▁EL
- ▁EXCEPT
- ▁SHA
- ▁FORWARD
- ▁EIGHT
- ▁ADDED
- ▁PUBLIC
- ▁EIGHTEEN
- ▁STAR
- ▁HAPPENED
- ▁LED
- ▁WALKED
- ▁ALTHOUGH
- ▁LATER
- ▁SPIRIT
- ▁WALK
- ▁BIT
- ▁MEET
- LIN
- ▁FI
- LT
- ▁MOUTH
- ▁WAIT
- ▁HOURS
- ▁LIVING
- ▁YOURSELF
- ▁FAST
- ▁CHA
- ▁HALL
- ▁BEYOND
- ▁BOAT
- ▁SECRET
- ENS
- ▁CHAIR
- RN
- ▁RECEIVED
- ▁CAT
- RESS
- ▁DESIRE
- ▁GENTLEMAN
- UGH
- ▁LAID
- EVER
- ▁OCCASION
- ▁WONDER
- ▁GU
- ▁PARTY
- DEN
- ▁FISH
- ▁SEND
- ▁NEARLY
- ▁TRY
- CON
- ▁SEEMS
- RS
- ▁BELL
- ▁BRA
- ▁SILENCE
- IG
- ▁GUARD
- ▁DIE
- ▁DOING
- ▁TU
- ▁COR
- ▁EARLY
- ▁BANK
- ▁FIGURE
- IF
- ▁ENGLAND
- ▁MARY
- ▁AFRAID
- LER
- ▁FO
- ▁WATCH
- ▁FA
- ▁VA
- ▁GRE
- ▁AUNT
- PED
- ▁SERVICE
- ▁JE
- ▁PEN
- ▁MINUTES
- ▁PAN
- ▁TREES
- NED
- ▁GLASS
- ▁TONE
- ▁PLEASE
- ▁FORTH
- ▁CROSS
- ▁EXCLAIMED
- ▁DREW
- ▁EAT
- ▁AH
- ▁GRAVE
- ▁CUR
- PA
- URE
- CENT
- ▁MILES
- ▁SOFT
- ▁AGO
- ▁POSITION
- ▁WARM
- ▁LENGTH
- ▁NECESSARY
- ▁THINKING
- ▁PICTURE
- ▁PI
- SHIP
- IBLE
- ▁HEAVY
- ▁ATTENTION
- ▁DOG
- ABLY
- ▁STANDING
- ▁NATURAL
- ▁APPEAR
- OV
- ▁CAUGHT
- VO
- ISM
- ▁SPRING
- ▁EXPERIENCE
- ▁PAT
- OT
- ▁STOPPED
- ▁REGARD
- ▁HARDLY
- ▁SELF
- ▁STRENGTH
- ▁
📄 License
This project is licensed under the cc - by - 4.0
license.
Voice Activity Detection
MIT
Voice activity detection model based on pyannote.audio 2.1, used to identify speech activity segments in audio
Speech Recognition
V
pyannote
7.7M
181
Wav2vec2 Large Xlsr 53 Portuguese
Apache-2.0
This is a fine-tuned XLSR-53 large model for Portuguese speech recognition tasks, trained on the Common Voice 6.1 dataset, supporting Portuguese speech-to-text conversion.
Speech Recognition Other
W
jonatasgrosman
4.9M
32
Whisper Large V3
Apache-2.0
Whisper is an advanced automatic speech recognition (ASR) and speech translation model proposed by OpenAI, trained on over 5 million hours of labeled data, with strong cross-dataset and cross-domain generalization capabilities.
Speech Recognition Supports Multiple Languages
W
openai
4.6M
4,321
Whisper Large V3 Turbo
MIT
Whisper is a state-of-the-art automatic speech recognition (ASR) and speech translation model developed by OpenAI, trained on over 5 million hours of labeled data, demonstrating strong generalization capabilities in zero-shot settings.
Speech Recognition
Transformers Supports Multiple Languages

W
openai
4.0M
2,317
Wav2vec2 Large Xlsr 53 Russian
Apache-2.0
A Russian speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting 16kHz sampled audio input
Speech Recognition Other
W
jonatasgrosman
3.9M
54
Wav2vec2 Large Xlsr 53 Chinese Zh Cn
Apache-2.0
A Chinese speech recognition model fine-tuned based on facebook/wav2vec2-large-xlsr-53, supporting 16kHz sampling rate audio input.
Speech Recognition Chinese
W
jonatasgrosman
3.8M
110
Wav2vec2 Large Xlsr 53 Dutch
Apache-2.0
A Dutch speech recognition model fine-tuned based on facebook/wav2vec2-large-xlsr-53, trained on the Common Voice and CSS10 datasets, supporting 16kHz audio input.
Speech Recognition Other
W
jonatasgrosman
3.0M
12
Wav2vec2 Large Xlsr 53 Japanese
Apache-2.0
Japanese speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting 16kHz sampling rate audio input
Speech Recognition Japanese
W
jonatasgrosman
2.9M
33
Mms 300m 1130 Forced Aligner
A text-to-audio forced alignment tool based on Hugging Face pre-trained models, supporting multiple languages with high memory efficiency
Speech Recognition
Transformers Supports Multiple Languages

M
MahmoudAshraf
2.5M
50
Wav2vec2 Large Xlsr 53 Arabic
Apache-2.0
Arabic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained on Common Voice and Arabic speech corpus
Speech Recognition Arabic
W
jonatasgrosman
2.3M
37
Featured Recommended AI Models