Model Overview
Model Features
Model Capabilities
Use Cases
đ longt5-large-16384-pubmed-3k_steps
This is an unofficial checkpoint of the LongT5 model fine-tuned on the pubmed summarization dataset, aiming to achieve efficient text summarization for long medical texts.
đ Quick Start
Google's LongT5: Efficient Text-To-Text Transformer for Long Sequences was introduced as an extension of the successful T5 model.
This is an unofficial longt5-large-16384-pubmed-3k_steps checkpoint. It's a large configuration of the LongT5 model with transient-global
attention, fine-tuned on the pubmed summarization dataset for 3,000 training steps. Since the model wasn't trained until convergence, further fine-tuning might be worthwhile.
⨠Features
- Based on the LongT5 architecture, suitable for long sequence text processing.
- Fine-tuned on the pubmed summarization dataset, targeted at medical text summarization.
đĻ Installation
No installation steps were provided in the original document, so this section is skipped.
đģ Usage Examples
Basic Usage
LONG_ARTICLE = """"anxiety affects quality of life in those living
with parkinson 's disease ( pd ) more so than
overall cognitive status , motor deficits , apathy
, and depression [ 13 ] . although anxiety and
depression are often related and coexist in pd
patients , recent research suggests that anxiety
rather than depression is the most prominent and
prevalent mood disorder in pd [ 5 , 6 ] . yet ,
our current understanding of anxiety and its
impact on cognition in pd , as well as its neural
basis and best treatment practices , remains
meager and lags far behind that of depression .
overall , neuropsychiatric symptoms in pd have
been shown to be negatively associated with
cognitive performance . for example , higher
depression scores have been correlated with lower
scores on the mini - mental state exam ( mmse ) [
8 , 9 ] as well as tests of memory and executive
functions ( e.g. , attention ) [ 1014 ] . likewise
, apathy and anhedonia in pd patients have been
associated with executive dysfunction [ 10 , 1523
] . however , few studies have specifically
investigated the relationship between anxiety and
cognition in pd . one study showed a strong
negative relationship between anxiety ( both state
and trait ) and overall cognitive performance (
measured by the total of the repeatable battery
for the assessment of neuropsychological status
index ) within a sample of 27 pd patients .
furthermore , trait anxiety was negatively
associated with each of the cognitive domains
assessed by the rbans ( i.e. , immediate memory ,
visuospatial construction , language , attention ,
and delayed memory ) . two further studies have
examined whether anxiety differentially affects
cognition in patients with left - sided dominant
pd ( lpd ) versus right - sided dominant pd ( rpd
) ; however , their findings were inconsistent .
the first study found that working memory
performance was worse in lpd patients with anxiety
compared to rpd patients with anxiety , whereas
the second study reported that , in lpd , apathy
but not anxiety was associated with performance on
nonverbally mediated executive functions and
visuospatial tasks ( e.g. , tmt - b , wms - iii
spatial span ) , while in rpd , anxiety but not
apathy significantly correlated with performance
on verbally mediated tasks ( e.g. , clock reading
test and boston naming test ) . furthermore ,
anxiety was significantly correlated with
neuropsychological measures of attention and
executive and visuospatial functions . taken
together , it is evident that there are limited
and inconsistent findings describing the
relationship between anxiety and cognition in pd
and more specifically how anxiety might influence
particular domains of cognition such as attention
and memory and executive functioning . it is also
striking that , to date , no study has examined
the influence of anxiety on cognition in pd by
directly comparing groups of pd patients with and
without anxiety while excluding depression . given
that research on healthy young adults suggests
that anxiety reduces processing capacity and
impairs processing efficiency , especially in the
central executive and attentional systems of
working memory [ 26 , 27 ] , we hypothesized that
pd patients with anxiety would show impairments in
attentional set - shifting and working memory
compared to pd patients without anxiety .
furthermore , since previous work , albeit limited
, has focused on the influence of symptom
laterality on anxiety and cognition , we also
explored this relationship . seventeen pd patients
with anxiety and thirty - three pd patients
without anxiety were included in this study ( see
table 1 ) . the cross - sectional data from these
participants was taken from a patient database
that has been compiled over the past 8 years (
since 2008 ) at the parkinson 's disease research
clinic at the brain and mind centre , university
of sydney . inclusion criteria involved a
diagnosis of idiopathic pd according to the united
kingdom parkinson 's disease society brain bank
criteria and were confirmed by a neurologist (
sjgl ) . patients also had to have an adequate
proficiency in english and have completed a full
neuropsychological assessment . ten patients in
this study ( 5 pd with anxiety ; 5 pd without
anxiety ) were taking psychotropic drugs ( i.e. ,
benzodiazepine or selective serotonin reuptake
inhibitor ) . patients were also excluded if they
had other neurological disorders , psychiatric
disorders other than affective disorders ( such as
anxiety ) , or if they reported a score greater
than six on the depression subscale of the
hospital anxiety and depression scale ( hads ) .
thus , all participants who scored within a
depressed ( hads - d > 6 ) range were excluded
from this study , in attempt to examine a refined
sample of pd patients with and without anxiety in
order to determine the independent effect of
anxiety on cognition . this research was approved
by the human research ethics committee of the
university of sydney , and written informed
consent was obtained from all participants . self
- reported hads was used to assess anxiety in pd
and has been previously shown to be a useful
measure of clinical anxiety in pd . a cut - off
score of > 8 on the anxiety subscale of the hads (
hads - a ) was used to identify pd cases with
anxiety ( pda+ ) , while a cut - off score of < 6
on the hads - a was used to identify pd cases
without anxiety ( pda ) . this criterion was more
stringent than usual ( > 7 cut - off score ) , in
effort to create distinct patient groups . the
neurological evaluation rated participants
according to hoehn and yahr ( h&y ) stages and
assessed their motor symptoms using part iii of
the revised mds task force unified parkinson 's
disease rating scale ( updrs ) . in a similar way
this was determined by calculating a total left
and right score from rigidity items 3035 ,
voluntary movement items 3643 , and tremor items
5057 from the mds - updrs part iii ( see table 1 )
. processing speed was assessed using the trail
making test , part a ( tmt - a , z - score ) .
attentional set - shifting was measured using the
trail making test , part b ( tmt - b , z - score )
. working memory was assessed using the digit span
forward and backward subtest of the wechsler
memory scale - iii ( raw scores ) . language was
assessed with semantic and phonemic verbal fluency
via the controlled oral word associated test (
cowat animals and letters , z - score ) . the
ability to retain learned verbal memory was
assessed using the logical memory subtest from the
wechsler memory scale - iii ( lm - i z - score ,
lm - ii z - score , % lm retention z - score ) .
the mini - mental state examination ( mmse )
demographic , clinical , and neuropsychological
variables were compared between the two groups
with the independent t - test or mann whitney u
test , depending on whether the variable met
parametric assumptions . chi - square tests were
used to examine gender and symptom laterality
differences between groups . all analyses employed
an alpha level of p < 0.05 and were two - tailed .
spearman correlations were performed separately in
each group to examine associations between anxiety
and/or depression ratings and cognitive functions
. as expected , the pda+ group reported
significant greater levels of anxiety on the hads
- a ( u = 0 , p < 0.001 ) and higher total score
on the hads ( u = 1 , p < 0.001 ) compared to the
pda group ( table 1 ) . groups were matched in age
( t(48 ) = 1.31 , p = 0.20 ) , disease duration (
u = 259 , p = 0.66 ) , updrs - iii score ( u =
250.5 , p = 0.65 ) , h&y ( u = 245 , p = 0.43 ) ,
ledd ( u = 159.5 , p = 0.80 ) , and depression (
hads - d ) ( u = 190.5 , p = 0.06 ) . additionally
, all groups were matched in the distribution of
gender ( = 0.098 , p = 0.75 ) and side - affected
( = 0.765 , p = 0.38 ) . there were no group
differences for tmt - a performance ( u = 256 , p
= 0.62 ) ( table 2 ) ; however , the pda+ group
had worse performance on the trail making test
part b ( t(46 ) = 2.03 , p = 0.048 ) compared to
the pda group ( figure 1 ) . the pda+ group also
demonstrated significantly worse performance on
the digit span forward subtest ( t(48 ) = 2.22 , p
= 0.031 ) and backward subtest ( u = 190.5 , p =
0.016 ) compared to the pda group ( figures 2(a )
and 2(b ) ) . neither semantic verbal fluency (
t(47 ) = 0.70 , p = 0.49 ) nor phonemic verbal
fluency ( t(47 ) = 0.39 , p = 0.70 ) differed
between groups . logical memory i immediate recall
test ( u = 176 , p = 0.059 ) showed a trend that
the pda+ group had worse new verbal learning and
immediate recall abilities than the pda group .
however , logical memory ii test performance ( u =
219 , p = 0.204 ) and logical memory % retention (
u = 242.5 , p = 0.434 ) did not differ between
groups . there were also no differences between
groups in global cognition ( mmse ) ( u = 222.5 ,
p = 0.23 ) . participants were split into lpd and
rpd , and then further group differences were
examined between pda+ and pda. importantly , the
groups remained matched in age , disease duration
, updrs - iii , dde , h&y stage , and depression
but remained significantly different on self -
reported anxiety . lpda+ demonstrated worse
performance on the digit span forward test ( t(19
) = 2.29 , p = 0.033 ) compared to lpda , whereas
rpda+ demonstrated worse performance on the digit
span backward test ( u = 36.5 , p = 0.006 ) , lm -
i immediate recall ( u = 37.5 , p = 0.008 ) , and
lm - ii ( u = 45.0 , p = 0.021 ) but not lm %
retention ( u = 75.5 , p = 0.39 ) compared to
rpda. this study is the first to directly compare
cognition between pd patients with and without
anxiety . the findings confirmed our hypothesis
that anxiety negatively influences attentional set
- shifting and working memory in pd . more
specifically , we found that pd patients with
anxiety were more impaired on the trail making
test part b which assessed attentional set -
shifting , on both digit span tests which assessed
working memory and attention , and to a lesser
extent on the logical memory test which assessed
memory and new verbal learning compared to pd
patients without anxiety . taken together , these
findings suggest that anxiety in pd may reduce
processing capacity and impair processing
efficiency , especially in the central executive
and attentional systems of working memory in a
similar way as seen in young healthy adults [ 26 ,
27 ] . although the neurobiology of anxiety in pd
remains unknown , many researchers have postulated
that anxiety disorders are related to
neurochemical changes that occur during the early
, premotor stages of pd - related degeneration [
37 , 38 ] such as nigrostriatal dopamine depletion
, as well as cell loss within serotonergic and
noradrenergic brainstem nuclei ( i.e. , raphe
nuclei and locus coeruleus , resp . , which
provide massive inputs to corticolimbic regions )
. over time , chronic dysregulation of
adr
đ Documentation
Results and Fine-tuning Details
The fine-tuned model achieves the following results on the evaluation set using beam_search=3
and without any specific calibration of generation parameters are presented below, altogether with the results from the original paper (the original scores are higher, very likely due to a higher number of training steps).
Metric | Score | Score (original paper) |
---|---|---|
Rouge-1 | 47.44 | 49.98 |
Rouge-2 | 22.68 | 24.69 |
Rouge-L | 29.83 | x |
Rouge-Lsum | 43.13 | 46.46 |
The training parameters follow the ones specified in the paper. We accumulated batch size to 128 examples and used the Adafactor
optimizer with a constant learning rate of 0.001
. The full training hyper-parameters and logs can be found via the following W&B run. The model was trained using the HuggingFace's trainer.
The only specific adjustment made for the training was dropping very short input articles (less than 16 words (a bit of mistake, should be less than 16 tokens)) as these sequences do not contribute to gradient creation in the transient-global attention, which resulted in training crashes when DDP was used.
đ License
This project is licensed under the apache-2.0
license.






