
Abstract

Automated scoring for divergent thinking seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of divergent thinking, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text model. This work presents an alternative approach that greatly surpasses the performance of the best existing semantic distance approaches. Our system fine-tunes deep neural network-based large-language models (LLMs) on human-judged responses. Trained and evaluated against one of the largest collections of human-judged AUT responses, with 27 thousand responses collected from nine past studies, our fine-tuned large-language-models achieved up to r = .81 correlation with human raters, greatly surpassing current systems (r = .12-.26). Further, learning transfers well to new test items and the approach is still robust with small numbers of training labels; in some cases, without any training at all. This work also suggests a limit to the underlying assumptions of the semantic distance model, showing that a purely semantic approach that uses the stronger language representation of LLMs, while still improving on existing systems, does not achieve comparable improvements to our fine-tuned system. The increase in performance can support stronger applications and interventions in divergent thinking and opens the space of automated divergent thinking scoring to new areas for improving and understanding this branch of methods.

 !"#$%&#'
Beyond Semantic Distance: Automated Scoring of Divergent Thinking Greatly Improves
with Large Language Models
(&)%&*+$%*%&,+) &)-+$$./0& ,
1).#"1
,1).#"#02&)
-1).#"#(&
Author Note
(&)%&* 0)344#%!5#(466667666,7869:7,,:6
#)#!%%#%(0)&%$)0#$!&!!))!#(&)%&*+& #"
)&%00#!)&!"# &#%%+1).#"1+88851&)1+1+
:6,6:+)&)5
 &$35#(&)%&*;!5!
Keywords:!1(0*(D&$&)))D$&(7$&(&( #!$)D&# &!
)%#(
Introduction
&!#&$$.+!1(0*()&%00&))&!. &) 
%0&$$()5/.0&+))#"!1(0*(&"# $&!&#7!!?&.+
?0%0%&))0 +""#+&!%#)#" &) 5%&!1&%)!1(0*(
)&%0+0#?1+0&1"#!0&&# &! 0#!)%&$&$.)%#&$&)#.#"
!1(0*(&)*+0$&))&)*=D/&.EF#0)#+,6,D &)E
&+,6GD &)&$5+,6,6>50) 0#!)%&&$H#0&&$#.#"27
( #!$)#%&$%$&) &%!)&%#$&#)0)??#!)&)& &)&$
!)&%&!)0&!)&%&)&#2."#0#?!1(&!&)"# &# 5$(&
"&#"0)&#&%0)0&)""%1$.unsupervised+0&!#)#I$&(
"# &(2& $)5
#?1+0&$ &#)#0%) &%!)&%&#&%0#&# &!
!1(0*()%#(5<)+0)%"%) &% #!$)0&&)!0&1
#"# ! #)#0&$%&#)?0&&$$&(&(#%))(=&(&$5+,6:D
&(&$5+,68>5?&!1&% )&%$&))#"!7&$?#*7&)! #!$)0&1
)0#?&%#))$.)#(&! ##)(&)#"0$&#)0)??#!%#%)
=1$&$5+,6:D&$5+,68D&!"#!&$5+,6:D&""$&$5+,6,6D&)?&&$5+
,6ADJ&(&$5+,68>5<0+0) &% #!$)%$.)!&# &!)%#(#$.
!)&!2&)&)#"!!?#!)+?0&)? #!$)&%%#"#0
%# $2)#"%#2?0?#!)&)!#(0#%# %&&)%#&))&(5
0)&+?! #)&&)("%& #1 "# &%#12)(
)%#( 0#!)."7(&(&(&(#!$)=)>K&%$&))#"&$
?#*7&)!&#&%0)# #!$(2K#)%##(&$.#")#))&)!#
$&!2& $)5&$)# &)0&$.#"0)&#&%0#)%&$#?# )+0
""%#"&()H# #!$)(0+&!0)(0#")?0#&."7(5
0?&#&%00&?#!%0)supervised+0&)(1&)#"7
#&)#$&"# +#?&!(&$#!%#)"#17"#7))5
1)!$&(?&)1#)$.&$!#0./%H&*&$5=,6,,>+?0#)!
"&((#2&%)&$"# &#"#)?0 &%0$&(%$&))")5
%#&)+#&#&%0))"7(+?0&$?#*7&)! #!$0&0&)&$&!.
&!0)%&)+&&(&(&(#!$&!#(&$2))"0&!#&)*7
)%"%!&&5!#()#+&%$&))"%&$!"# &)#("#!&#&$*#?$!(#"
$&(&($&(0#?#0$&#)0?&&!#)%0&)0&
?&)#)&!& $7&@!( #")#(&$.5
0 #1 ))!0&&)#($&#10%)&7#"707&
&#&%0)&!#1!)1!%)(()(0&0.%&"0 #1!?0&!!#&$
&(!&&+$&( #!$)+&! #&##"0&#&%05))%0&)0#)&$!
0K9=&""$&$5+,6,6>&!7-=/#?&$5+,6,>K"$%0%)
&#&%0)#027&)!!# &)+&)*))%0&)) &&$.))=#%0&$5+,6->+
%# #))&)#(=# $&$5+,6>+&!I)#&)?(=&@*&&$5+
,6C>5)&$)##!%?%0&$$()#0)!.#"&# &!!1(0*(
)%#(5<##+!&$?#* #!$)+?0 $$#)#1$$#)#"&& )+
I#1$&#&%0)#?&!2$&(0%# $2&!&%!&$$#(%=/&!#
&&$5+,6,6D(&$5+,68D#@ &&$5+,6,,>5!!#&$$.+0)#"&(
!&&I)$&(+0(07I&$. 7$1$(#!0+&! &.I #!&&7)0&(
%##!&#?00!1(0*( &) %# .5
0))!.&)*)0"#$$#?()&%0I)#))(performance, robustness,
&!transferability3
5 #?!#)&""%"# &%"#&# &!)%#(L
,5 0&)0"# &%""%#"%&))&(!&&)HL
-5 #??$$!#&!)&)"#)# )L
Background
Divergent Thinking and the Alternate Uses Task
))#"!1(0*(=>!&&%*#0%#(1&)))) ""#)0&
 (!0&$.,60%.=$%*&$5+,6,,>5/((?0$"#!=896>+&$#(
?00) &$?#*).#&%=8CC+8:6D) +,66C"#&1?>&!%#=88D
,6->+))0&1%# 0 #)#$& 0#!#"%&1.&)))) =.!&$5+
,68>#0).%0#!%&#&$)()&!)&%05M&"?$#(!&$=& #!&$5+
,669D%#&$5+,66D#&%+8A,DN&%%&#&$5+,69>&! &7&&$.%)!)= +
,66:D&!7?&$.&$5+,6,,>0&1#1!!1!%#"0!%1#?5))&
#")!#!%%&1#&$=%#E%&+,6,>&!&)#  ))!"#("!
!"%&#5
0&1&#).)#")))!))%0&)#)I%)+)&%)+
 $&)+&$)%+#$ 7&#+&()+%#)%#+%
# $#+&&()+&!)*(M)#)=%#&$5+,6CD#&%+8CCD
&$$&%0E#(&+8C9>5#0#7!!&#"0)&)*)+0& &.!""
0#!)#")%#(+0%#1#&$!%)%# )"$%.= #"#!%!
)#))>+"$2$.=!1).#")#))>+#(&$.=)&$+%# #)#))>+&!
$&#&#=$1$#"$(&%&!!&$>5#&%))#"&10*(=>)#
2& $#"&%&1.&)))) 0&$!)#0%$)&!)%#"))?0
)1&$&)*)?01&.(.#"&)*)%)&(&!=%&+,6,,>5<#0&)
)1&$!%&!)+0&)0 #)#$&.#"))!)&%0+0#(0#
%))&$.0)#()=%#&$5+,6C>5+&%&)&&)*!#(&))"#
1.!&.#@%)5/)!)0)))&$)%+0&))# 1&&#) )#"2$%
)%#)&!)%"%# ))!5<#2& $+001&$"# #"
="!#.#&%&)0)&$)))>))0# )%&&!%&!#&!#2+
?0&)&$$&%0&!#(&O))=8C9>))?)&&!*"5*#0&)*)+
0&))%#! &&$$.)(0 &@!()$%$.502)%#))0)
#1$ 0#!)0&&)%%))"$I%*$.)%#()#))#!%!"#))5
Automated AUT Scoring
# &#&$&#&%0)#)%#(%&1.0&12)!"##1""..&)
=<#0 &E#$+,6,,D&$)+8A6D&$)EH$+8C:>50 #!)!.#"
&# &!)%#(+0#?1+0&))##)& 0#!*#?&)& &%&$.))
=D?)&$5+886D&!&E &)+88A>5+*#?&$.7#&)=I "#
!2(>+)& 0#!"# "# &#1&$)&%00&"!)$&#)0)??#!)
.&&$.H(0%##%%%2)5)&).$!(& 7!#%  &2+& &2
#" %#)?0&%0P%#$ O))&I?#!+&!&%0P#?O))&
!#% 50+&$)& &0 &%&$#%))+($&&$%# #)#+#!%0
 7!#%  &2#&)&#?0"?! )#)0&#?#!50)
!%#&$$#?)!#% )#)!&)& 2#"nK)&$$.&"?0!!K!)
! )#)&00&0#)&!)#")&) %#)5%# ##"0)!%# #)#)&
&2#" 7! )#  )5"$&(0%#7#%%%#"$&#)0)+0
&")?#!)&)!.! )#)+?0%0)0$&(0#0##!#"
&("#0#)?#!)50! )#)&&%$!&)&%+?0) $&?#!)&
%$#)#(0& &)&$+(# %))50 #&?#*.&!&&! &)
=88A>?0%0#$&H!).%0#$#(.#!0&0)&#&%0&#2 &!0#?
0 &)0 )$1)&)"# &#"# &$&1$.)$&(&(5*)+
&%*#?$!()&"? &")?#!)+)0 #&!)+$&! #!$#"$&
#%)&!%#%)5&!!#+&&).#)7&)!)#"?&?&) &!&1&$&$#0
)&%0%# .&#2 &$.,6.&)&(#=)0344$)&5%#$#&!#5!4>+&!0
&%%))$.#"0) 0#!#).%0#$#(%&$)&%0)0$!% )#$&.5
0%)&7#"707&&# &!)%#(&#&%0)&%$1)#"&!
) $& #!$)5?$$7&! #!$)& &#") &%)&$&(&(+?0%0) $&
%#%)0#$!%#%$&&%0#0+?0$!)@#%#%)&"0&&5%)&!
#2)(2)+0!)&%?%#%)%&&$)#!&)& &)#"#)1!
$&!))?0 50)*)# &)0*(0&)!1(+))(+#
!)@#"# &(1# 3&(#&$0&&$()&$.?0!)&%?0&) &% #!$$*
5<#&# &!)%#(+&%0# &!)#)&#@%!#1%#)0) &%
)&%50%#)#"0&($)0)!#%&$%$&0!)&%?0)1%#)+#0
) &%!)&%50))0!$.(&#&%0&*.%$.$H!&# &!
).) )$*&1.%#(=+(&)%&*E &)+,6,6D
0)344#)%#(5!5!>&! )=/&.EF#0)#+,6,D0344) !)5?$5)5!4>5
&! )!"" )#"0#?2)&#%))!+0#?0&))&0&!$!+0
&(2))!#"# 0 #!$)+&!0) &%)&%&#&%05
%+00&1)1&$) $&.)#" #!$(&#&%0)+%$!(
#&$)%& &%&$.))=#" &+,6->+#7(&1&2<&%#H&#
=E(+888>+&!&%0$$$#%&#=/$&$5+,66->5#%$.+&%$&))
#"P?#! !!( #!$)O=)>0&1)!0)& !?0&!""&$)5
)!##&%*(%##%%%)&!#% &!+)&!+$##*&%#7#%%%)?0
&?!#?#"&?#!O))&(&250)&$$#?)&(#)%&$&!!%#)"# $&(
!#% )5#!,%&!&(&)&!%##$ +)*(0#!%&&(
?#!"# )%#2#1%71)&=*#$#1+0+&$5+,6->&!#!%!# H&#)
"#&(=*#$#1+)*1+&$5+,6->5$#)! &2"&%#H&# 0#!)?0&
?#!7?#!%##%%% &2&)!#$#%&$%#2?!#?)=(#&$5+,6G>5
"&)2#!%!)7?#!)&#)#"?#!)+&$$#?(1#)$.)?#!)#
)!) #!$+ &*()).&%%$&#)0)&!!##) &%#)
=/#@&#?)*&$5+,6A>50)+&$)#%&  #%# #$&%#)0& #!$)7
&!# &))1%##&#"(&$H!($)02)501&#) #!$)$1&(!.
&! )"&$$?00"& $.=/&.EF#0)#,6,D &)&$5+,6,6>5
1&$$+0)&#&%0)0&1)#(0) $&(#&$)?01&.( 0#!)50)&+
semantic model))!#"#0(&$#!.#" #!$(&#&%0)0&&$$#?)$&$.
%# &&$!)&%)??#!)&)&)&!7"#) $&)#") &%)?0#)
?#!)50)&#&%0#&# &!)%#())1)!+&)?#!!)&%%&
%&$%$&!?0#&.!$.(*#?$!(#"0&)*#)#))5%?#*0&)&$)#
2$#!supervised approaches5)1#)$. #!+/%H&*&$5=,6,,>##*&"&
((&!%$&))"%&#&#&%0+&( () !!()?0#0"&)&!
&(!%1 #!$))(%$&))"))%0&)&!# <#))&!Q/##)5/&.&!
F#0)#=,6,>&$)#"#!0&&?) &%)%#)%#$! #1!.&$&"&%#
?(0(50))&%0)!!#)0#$ &)&)1)!$&(&$%&#+
0"!(0&$&"&%#?(0( #1!01&$!.#")%#))(())0&(
&)!##*#?$!( &.#1!& %00(0%$("#2$&(#(&$.)%#)
0&) &%!)&%&$#5
0))!.&#&%0)&# &!)%#(!%$.&))1)!$&(5
!$.( ))0)& +0#(00 0#!)!"""# /%H&*&$5O)=,6,,>
&#&%05#?#*0&))!0&&!( #"fully supervised learning withfeature
engineering =&$5,6,>+?0%02&%))&$"# &#)#))&!&)
%$&))")?00&"# &#5?#*"#$$#?)0&&!( #"pre-train and finetune =
&$5+,6,>+?0&&$?#* #!$)7&!#&(&!&$#"(&$!&&&!)0
&!#&&)*7)%"%#@%1?0&($&$50))!.+?)&! #!$)
"# 9=&""$&$5+,6,6>&!7-=/#?&$5+,6,6>5
Recent Innovations in Natural Language Processing
%.&)0&1)&(&!&$#"%0&(&&$$&(&(#%))(&#&%0)5
 #1 )#!&$?#*)0 &%0$&(%# .0&1$#%*!?&.)
# #!$2?0 #&%&!%# $2.5 &@##1&#2 #!$()0
transformer &%0%,?0%0$H)&%#%%&$$!attention=&)?&&$5+,6A>5
# &*)%# &#&$$.&%&$"#&&)"#  #!$#%#)!&$#()I%
#"2+.)$%(&)#"0)I%0&& #) #&50)&$$#?)&($&(
#!$)##@)?#!)+0%# $2%#2)?0%00#)?#!)#%%5?##&$
&$.&)"# 7&)!&%0%)?/=1$&$5+,6:>&!=&!"#!&$5+
,6:>5/?&)&$&! &* #!$&!#0&%0%)))I$.# H!# #1!
#%$!(#/&=&$5+,68>+Q=J&(&$5+,68>+&!9=&""$&$5+
,6,6>5
&&$$&(&(#%))(+&)"# 7&)! #!$)K)#  )%&$$!&(
&(&(#!$)=)>K(&$$.#"# )#)&!&!&)*)+&!#".$&(
&()=&(&$5+,6:D&(&$5+,68>50)+&#%))%&$$!transfer learning)
#"&%%!?0%00 #!$)&&!#2&#!&.I&)#"2&!&
$&)!$%$."#))#0&#0)&%0)&!&%#)%&$!#0 #!$#"
$&(&(&00&$!(5
.%&$$.+& #!$!(#)&"# #")1)!$&(?0"7(?0&)*7
)%"%2)&!&)#&(1#$ 5"7(+)# #&$$0&$?#*$&.)
&"#H)#0 #!$%&%#(&!"#)%"%&)*)0&!1!&$
)&%0)%0##)+2%2& $))&&)*7)%"%#@%1&00&(%2)5
)# %&))+&&!!#&$%$&))"%&#$&.)&!!#0!#"0?#*5
0%$#))%##$$&.#)%#(& #()&!&!&&$$&(&(#%))(&)*)
&.) &%2&$) $&.=>+?0%0)*)#$&$0) &%!)&%?
)%)#) $.$&$)?00?#!"")%) &0)& 0(50)&)*
1#$1)!)&%?$&(&(&)+ &*(&))$)#"%#)) $&.&$&
) &% #!$5 )&!1.%0=,68>0&1! #)&!0&&*(0) $&.#"
&?!$.( !!()"# &)"#  #!$)$&!)###"# &%#)%0&)*)
&!) $&)#")#&)*)?#$!$*$.&$)#)"""# ##"# &%5J+
1#&)*)+0"# &%#""7!)#?(0)0"# &%#")
=5(5+&""$&$5+,6,6>5
0))!.+&)&!&! #!$)"7!?02& $)#"0 &%&1.
@!( )#(&&))#"#(&$.5$&2 ))0#?0&0&? #!$
?0#"7(!#)0&1)# ))#"#(&$.##"0#2+) #1!?0
&(5
Data
0!&&)!0))!.?&)%# $!"# )1&$#)!)?0&1&$&$ 7
$1$0 &@!( )#")#))&)?$$&)$)0!%$.7%#$$%!!&&?0
$ &.7&(!&%&)50"!&!!&&0))!.)#&$$.0$&()%#$$%#
#"0 &7@!(! 7$1$)#))%# $!+?0,A+,A )#))"# ,+6-8
&%&)&%#))!&&))5
0#11?#"0!&&)))&)"#$$#?)+#(&H!.0!""#&%00&)
)!"##(5
5 &$:=/&.&$5+,6:>30)!&&))!# )"#box&!rope+
&! )!#A&!$&%&)+)$(nB,+8:#&$)#))5)#))
?@!(!."#&)+?0&&1&(!&!# &%$&))%#$&#%#""%
=ICC2k>#"5:5
,5 ),=/&.E$1&+,6,>30)!&&))!&)($# +brick+?0--%#$$(7
&(!&!$)5)#))?@!(!.0&)=nB+:6A+ICC2kB5A,>5,
-5 !#!,6= &)&$5+,6,6>30)!&&)%#)))#"6# )7book, bottle,
brick, fork, pants, rope, shoe, shovel, table, tireK&! )!#8,&%&)+&!
%# ))9+G-9#&$&()5?&))%#!.0&)=nB9+G-9+ICC2kB5:9>5
G5 0 )$=#"$%0#&$5+,6C>3 0)!&&%# ))C-:&%&)&!?#
# )+"#paper clip&!brick5<#@!()&!0)#))=nB-+:G-+,5CA>5R-
95 #)"30)!&&))"# &#(#()!.!1$#(&!1(0*()"#
$ &.7&(!)!)50!&&)!0))$$(7%#%!!&&"# &
###"0 &)+?0:# )&! )!#-:9&%&)3hat, toothbrush,
bottle, spoon, pencil, ball, lightbulb&!sock&!@!(!.G&)=nB,+8,G+ICC2k
B5A->5
C5 #)30)!&&%#)#!)#&$#1)##"0motesf!&&+?0-9&%&)
&!0)& # )+&)?$$&)backpack&!shoe5=nB--8+ICC2kB5:>5
&&)&1&$&$&0)344#)"5#4(HG"%4
,
-
&&)&1&$&$&0)344%#)1&%.5 5!40&!$4,884A,C
A5 )&$6:=$1&&$5+,66:>30))&%0)!!!1(0*(0#(0)2&)*)+
%$!(%#)I%)+)&%)+&!&$&))50))!.))0&$&))
&)*)+?0%0&)*!,G&%&)"#%&1))"#&brick&!&knife50@!()
&!0#(&$.#")#))+?0=nB-+G,9+ICC2kB5G:>5G
:5 )A=$1&E/&.+,6A>30)!&&+G,%#$$()!)?&! )!?#
# )3box&!rope5)#))?@!(!.0&)=nB,+,A,+ICC2k
B5CA>5,
85 ) #68=$1&&$5+,68>3<&$$.+0)!&&)+,6,%#$$(7&(!)!)?
&)*!#!1$#&$&))"#0&)*)3brick+knife+&!box50#(&(
)!.+-&%&)? #1!"#$#?(&( D0))!.))&$$!&&
&1&$&$5)#))?@!(!."#&)=nBG+688+ICC2kB5C8>5
Table 1
Counts of Rated AUT Responses Prior to De-duplication
Dataset    Responses
motesp     963
bs12       1807
snb17      2372
betal18    2918
motesf     2924
setal08    3425
hmsl       3843
snbmo09    4099
dod20      5490
&$$!&&))+!1!&$)#))?%#!!. $$0 &&)5&)
)&%0+0)#%))1#$1! $$&)?0#)%#!0)#))!1!&$$.=&))&$5+
,6:D$1&+,6>+)$%! &2 &$))=/!*&$5+,6-D0&?+,6,D$1&+,6>#
G
&&)&1&$&$)&0)344#)"5#48!2A5
6

&)&#&$)#))=%&&$5+,6,D%#E&H+88,D$1&+,66:>5%#(#(&$.
%&%#)!!&) ##"&normative&)*0&&#@%1#50&0) &))0&"
#(0&)&&)*!+0.(&$$. #1#?&!&%#)))##(&$.&()+1"
?$$7&!!1!&$&) &.!""#0!""%?0(0$.&! #!&$.
#(&$)#))50))!.)!0 &#" $$&()"#&(#!0@!( #"
&%0)#)O)#(&$.5&&?&)#!!#0&)655&&))? &!#&
"17#)%&$"0.!!##(&$$.)#+)%&$($&$.?0   &!
&2  )%#5
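The score preparation described above (averaging several raters' scores and placing every dataset on a common five-point scale) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names, the 1 to 5 target range, and the order of operations (rescale, then round to the nearest 0.1) are all hypothetical choices.

```python
def rescale_linear(score, old_min, old_max, new_min=1.0, new_max=5.0):
    """Min-max rescale a score from [old_min, old_max] to [new_min, new_max]."""
    if old_max <= old_min:
        raise ValueError("old_max must be greater than old_min")
    fraction = (score - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


def ground_truth(ratings, old_min, old_max):
    """Mean of several raters' scores, rescaled and rounded to the nearest 0.1."""
    mean_rating = sum(ratings) / len(ratings)
    return round(rescale_linear(mean_rating, old_min, old_max), 1)


# Hypothetical example: three raters scoring one response on a 1-3 scale.
print(ground_truth([1, 2, 2], old_min=1, old_max=3))
```

A dataset that already uses a five-point scale passes through unchanged when old_min and old_max equal the target range.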
&& )?!!$%&!+)##$.##"&%0# 4)#)&?&))1!5
#&$+A+69)#))? #1!"#(&!)#))+,95:S#"0!&&+)$(
&"&$!&&)H#",6+,6,)#))50 #)&!)#)?&)#)&%*&)&
&?(0D0 #)%# #&)0#?&$,5!$%&#?&)!##$.#2&%
!$%&)+)#"#2& $P$!&0#)O&!P &*&0#)O?#%#)!!!$%&)5
Table 2
Most Repeated Responses to Prompt Items
prompt response count
%* &?(0 C8
%* ?&# ,G
%* &?(0 8-
%* !##)# :8
# $ :G
#2 0& AC
%* $!&0#) CA
&$!)&%0?00+!!$%&#?#$!#%))&$.!#+%&)
)%# ##)!""&%&)(10)& )#)5# ))$.#&)
"# &#&#0%# #)#))#0$)%##(&$.=#&%8CC+#&%
8A,>50 #1&#"#2%$!(!$%&)#))0)0&!$%&)(&$.
&!1&&()1)!$&(?0#1&$( %0&#0!$.(#)))#"0
##$5)&"# #"data leakage+?0)# #"0)(!&& &*))?&.#&(!&&+
&!&)#)%0&$$(##!%$)$)%# #$.)"$!)?$.&!#( &%0
$&(=&##E&&.&&+,6,,>50$0)%&))#%))&$.&&$)%
&!1&&(+0))!.& )#"#%) ##0#)))#")$&(0#!)#"
0&!2!(#?)#))#1?# )5#&#0?&.+"?
!!#!7!$%&0%)!.+#"!()?#$!$##*)#(0)#%#"0&
)(0?#$!$$)$))&#0(&$H&$.#"0&#&%05


#!0)%#)=55+0 &@!(!#(&$.&()>"#&()#))?
&1&(!5<#2& $+&%0&%&0&0&!P&?(0O&)&)"#P%*O?&)&!.
&&$#"0 &@!()+&!0(#!0)!0))!.%# !&$$0#)
@!( )#&)($%#)))%###!7!$%&#50)?&.+&$0#(00
&!)#))!!#&&#&(!&&) #0&#%+&$$#"0&!0 &
@!()?0#&!0#)&!)#))#1!!I&$$.?(0!"# &#&#?0&0
#(&$.#"0&)#)?&)+%&)?&1&(!0#)&()5
Experiments
&&?&)&!"#2 )performance+robustness+&!transferabilit.5
0)2 )&$(#00)&%0I)#)5
Performance: How do LLMs affect performance for automated AUT scoring?
#! 0 &)&(&$"# &%#"!"")%#(&#&%0)+&:67
979"$$.&!# H!&(4%#))71&$!&#4)()$?&))!+?0)1)!$&(
0#!)?&!?0:6S#"0!&&&!1&$&#?&)"# !#9S#"0
!&&)50 &(%#))71&$!&#!&&)##&$$.)!"#)(#())&(&)&0$!7
#)!(&(?0#%# # )(0)(!&&5
Robustness: What is the performance effect of increases in training data size?
1)!$&(I)&(2& $)+0#? &.2& $)L00
 &."# &%1&$&#?&))$"# &$$(#!0+)&$)#?#0 &)(0
#)))#")1)!$&($&1#&(!&&))H5)(0)& )$&))!
&#1+ #!$)?&!?0) &$$)))#"0&(!&&+#)0#?0&1&$&$.
#"&(!&&&""%)0I&$.#"0 #!$5
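One way to build the smaller training subsets described above is sketched below. The nesting of subsets (each smaller set contained in the next larger one), the function name, and the example sizes are assumptions for illustration; the text does not say how the subsets were drawn.

```python
import random


def training_subsets(train_items, sizes, seed=0):
    """Return {size: subset} where each subset is a prefix of one shuffle.

    Using prefixes of a single shuffle makes the subsets nested, so a larger
    subset only ever adds examples to a smaller one.
    """
    shuffled = list(train_items)
    random.Random(seed).shuffle(shuffled)
    return {size: shuffled[:size] for size in sizes if size <= len(shuffled)}


# Hypothetical subset sizes over a 100-item training partition.
subsets = training_subsets(range(100), sizes=[10, 50, 100])
print({size: len(items) for size, items in subsets.items()})
```

The test partition stays fixed across all subset sizes, so differences in scores reflect only the amount of training data.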
!!#&$$.+?1&$&!# 7&)!&#&%0)=5+,6,,>?0%0!##)
&."7(5&0+0.2$%$.=55$&($)0>&)*&7&! #!$#)%#0
#(&$.#"&)#)5%# &!# )?0"12& $)%#)#1!!+&)?$$&)
?0#&&$$5
Transferability: How do trained LLMs transfer to unseen prompts?
0&#12 )"#%)#0#?0 #!$$&)#)%#?)#))#"*#?
# )5%#)!(&)"&$.+0))!.&$)#$##*)&0#??$$)%&(&$H
0&)*+#)%#?)#))"#17"#7)# )5
<#0)2 +!&&))$.# 50# ))!0&(
!&&&!)!&&& &$$.2%$)15&(# )%$!!brick, rope, box, knife,
book, table, tire, ball, lightbulb, pencil, shoe, sock,fork+hat, toothbrush+&!backpack5# )
0))?paperclip, spoon+bottle, shovel+&! pants5
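The prompt-level split described above can be sketched as follows. The row format (a dict with a "prompt" key) and the function name are assumptions for illustration; only the list of held-out prompts comes from the text.

```python
# Held-out prompts named in the text; rows for these never reach training.
TEST_PROMPTS = frozenset({"paperclip", "spoon", "bottle", "shovel", "pants"})


def split_by_prompt(rows, test_prompts=TEST_PROMPTS):
    """Split rows so that no prompt appears in both training and test data."""
    train = [row for row in rows if row["prompt"] not in test_prompts]
    test = [row for row in rows if row["prompt"] in test_prompts]
    return train, test


# Hypothetical rows for illustration only.
rows = [
    {"prompt": "brick", "response": "doorstop"},
    {"prompt": "spoon", "response": "catapult"},
    {"prompt": "rope", "response": "belt"},
]
train, test = split_by_prompt(rows)
print([row["prompt"] for row in train], [row["prompt"] for row in test])
```

Unlike a random split, this design tests whether a model has learned the scoring task itself and not prompt-specific regularities.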
,

Methods
0))!.%# &!)1)!$&(?0?#&%0%)K9=&""$&$+
,6,6>&!7-=/#?&$5+,6,6>?0&.%&$)1)!) &% #!$&)$+
)(0 $ &#)"#  &)5&$=,6,6>&!/&.&!F#0)#=,6,>5
!!#&$$.+&)1)!&$%&##"7- #7$?0%&#&%0))
%# &!5
Baseline: Semantic Distance Methods
)&&)$+0%)&7#"707&&# &!)%#( #!$)&&$!+?0%0
)!)&% %)) &%)&%)5%"%&$$.+)%#()&$!"# &1.
%#(=+ &)5&$,6,6>+&! )=/&.EF#0)#+,6,>5
 )%$!)"1&! #!$)+&)?$$&)&) $)%#?0%0&*)0 &
#"&$$"1)%#)=/&.EF#0)#+,6,>50) $)%0#)"##(0+&)
SemDis_MEAN, &))0&0#)O%# !&#&!0)"# 5!!#&$
&& %# !&#)"# /&.&!F#0)#=,6,>&&$)#"#$$#?!3)(2
%$&(?0)#$)?#! #1&$&!&$.( $$%&1&00&&!!1%# #)#
#"?#!)#0&))5
0&1.%#(&)$)&)!#?#*"#  &)&$5=,6,6>&!
))0$#7&)! #!$%# !!0&&5$#)&"# #"?#! !!(
#!$")!)%!.(#&$5=,6G>&!$&)!?0&)#"&! #!$)5
+0-667! )#(&?#!C/7&! #!$))!=(#&$5+,6G>+
?0%0?&)&!#C$$#?#!)"# *!&&!0(&?#!9%#)=&*&$5+
,6>5)?0 )+0&0#7%# !!&& )&"#$$#?!30).)  #1)
"%#?#!)=stoplisting)+%# #))0&))?0& &#"?#!1%#)?(0!.0
$&1 #&%=term weighting>+&!&1#!)&$H()#))0&)&# ?#!
.2%$!(0# ?#!"# )#))5
/#0&)$).) )&&1&$&$#$3 )&0344) !)5?$5)5!&!
&1.%#(&0)344#)%#(5!5!5
Semantic Distance with Large Language Models
&(&(&(#!$)&(&$$.#)%%))"$&##"0#2) &%!)&%
= )E1.%0+,68>5#?1+)#))$#$!!#?)& ) &%!)&%
#!$)###")+$&!"# &!&)#")%)=$&*&&&$5+,6,,D&$5+
,6,D )E1.%0+,68>57&)! !!(%#$!#&$$.$1&(0
)#(&$)&##"$&(&()&&!0&!$(#"0&))?0$
)$$&$$#?("#)1)!)0&!##"&$&# &!!1(0*(
)%#().) )5
-

+ !!(7&)!) &%!)&%)%#)&#!"# 7- !!()
=$&*&&&$5+,6,,>&!%79=&$5+,6,>5&%0#"0) #!$)0&1!""
)H)+&)$)!&$-5#0&0&%0%#"0%79 #!$)#$.%$!0&$"
#"0%#)#!(9 #!$+?0%0)?0.0 #!$& !)97-/0&)0&$"#"0-$$#
&& )0&)& ()(())5
Table 3
Size of Embedding Models Trained on LLMs
Model                        Model size         Embedding size
gpt-text-similarity-ada      300M parameters    1024
gpt-text-similarity-babbage  1.2B parameters    2048
st5-base                     110M parameters    768
st5-large                    335M parameters    768
st5-3b                       1.24B parameters   768
Fine-Tuned Large Language Models
&(&(&(#!$)&0 &."#%)#"0))!.5+?#&%0%)&
1&$&!39=&""$&$5+,6,6>&!7-=/#?&$5+,6,6>5
T5
9=27#72&)"&)"# >))#"&%0%)#!%!.&""$&$5=,6,6>5
9?&)0#!%#"&)!.%# &(1&#) #1 )#/7$* #!$)5%
)%&#" #1!.$&( #!$)&! #&(2)+))#  )!""%$
#! ?0 #1 )&%# ("# 5&""$&$5=,6,6>%# &!!""
#!$(&#&%0)?0$%##$$("#%# &#&$)#%)&!!&&%#2)+$&)(
#!$)"#0)7"# (&%0%)5
This study uses the T5-Base model, which has 220 million parameters in its network. There are three larger released T5 models, up to 11 billion parameters, which would likely improve task performance with a greater cost to implementation and transferability.
9)&27#72 #!$+?0%0%&))&$$)#&"# &?02)(1&)+
&!0 #!$#1!)#&)2950) &.) &$$7""#)%#(+?0
0)#(&&%##)I&"!1&&$+#)&!2(&#5
&$.#%#)&# &*0#& D9%#$!(&&)#$$#I.0).$#"
& $"!)!#5#0$))+&""7(+$&)0&0)#))2%!#
 %&$5
9
0$#"0) &)%?&))(()!.&0) &+&)!#)&)&%5
#&%%# #!&027#72"# &+&(&!"%)?"# &!#
0"#$$#? $&3
{prefix} {prompt} {response} ,
?00"2)Pautscore:O+0# ))%!&)Tquestion: What is
a surprising use for XU&!0)#)))%!&)Tresponse: YU5<#
&(+0&(#?&)0(#!0)%#+?0#%)##+&)&)(
#"0 5<#"%+0) ?&)!%!&!!&)& 5
 #)&12& $)&)0#?&$G5
Table 4
Example Inputs and Outputs for T5
Input Output
&)%#I)#30&)&))()"#&/)#)3$&.&% &* 956
&)%#I)#30&)&))()"#&)#)3&*0 #"" 59
&)%#I)#30&)&))()"#&)#)3"()(#? 956
&)%#I)#30&)&))()"#&<)#)3)$ 56
GPT-3
7-=&17&!&)"# ->)& #!$"# =/#?&$5+
,6,6>0&#H)2(&#+& (#!%?0&2"#$$#?)&5&(
&$)%0 &7$*2I))# 1&&$.+?0%0?#$! &*"###%$&))"%&#5
#?1+)#))$#!#?0P &O+0&& ?0%0&""%)0#?1&&$
0)& $(#"?#*))500 &&H#+0 #!$#))())2+
&$$#?(#)!&) $&27#72 &&)95
0$&()7- #!$)0&1#A9/&& )+?0$) &$$$&)! #!$)
0&1&#2 &$.-96+5-/+&!C5A/&& )50))!."7)&1)##"&%0
#!$+?0%0&"!#"# ) &$$)#$&()K&)ada+babbage+curie+&!davinci5
7-)#$.&1&$&$0#(0&&!&$%&##(& ("&%=>"# 5
0) &)))&!(0&1)# %#))+#%))(#%%)#&0#)!)1+
$#?(0%# $2.&!%# &#&$!)#"&$.(5
Zero-Shot GPT-3
<&$$.+&#1$H#7)0#&#&%0)%# &!#)00*#?$!(#"&7
-+?0#"7(5N#7)0#)&.#")1)!$&(?0+!)#)(
#2& $)+& #!$%&"))$%$&)))"# &"$.!)%!%#25<#2& $+
9

# (0&)*T0&)0#(&$.#")%2'#&)%&$#"76U+&!& #!$ &."
0&#"0&)*&!0%&=76>5#?1+0)&#&%0!#)""# # 
((=55+%&"$$.?#!(0#0 #!$#(0 #))"$#>#
! ?0&.)#"I)#)&)!)##!.0 #!$5
Results
Overall Performance for Replicating Human Judgements
0#1&$$"# &%#"0$&($&(&( #!$ 0#!)+#&!# H!+
!!$%&!)#))+))!0%#$ #"&$95&)#%#$&#
?00 &7@!(!(#!0&!0 #!$!%#)#!5!!#&$$.+
%#$&#?00 &)#&%0)7!&&)))0#?5"# &%&()"# r B5,#r
B5:5&!!###1&$$%#$&#+& &#"7# %#$&#))#1!!+
pP
rp
n(P)
"#&$$# )P =&$C>5&# %#$&#)$))))1#!""%$)?0
!1!&$# )&!!""# )& $)H)5&# %#$&#)&("# r =
58#5:65
Table 5
Overall Performance of Each Model
ALL betal18 bs12 dod20 hmsl motesf motesp setal08 snb17 snbmo09
Baseline
semdis-mean 5,6 5,6 5CA
V
5,G- 599 58 756G9W56CGW59AV 756,6W
ocs-main 5,9C 5-8 5A: 5-A 5-CG 5,9A 5--AV 5-,: 58- 5,89
LLM Embeddings
st5-base 5,,- 5GG- 5,,A 5G9 5,A: 5,, 5G, 5,9A 5,89 5,9-
st5-large 5,,A 5G,9 5,8 5-:9 5-68 5,,C 5G6-V 5,G9 5-6A 5,CA
st5-3b 5,6G 5-8- 5,C6 5-G8 5,:6 588 5G99 5,-G 5,:: 5,-
gpt3_emb-ada 5,:9 5-8C 5,:G 5G,8 5-8C 5-:, 5G:6 5-G 5-C8 5,9G
gpt3_emb-babbage 5A- 5-,6 5,G 5-9, 5,9, 5,G 5-G6V 5,G- 5-8 59
LLM Fine-tuned
t5-base 5A9C 5C6, 598- 5AC 5A,9 5CC 5G:G 5CC6 5C,A 5C99
gpt3-ada 5ACG 5C,A 5C 5A9 5A,G 5C:C 598G 5C8G 5C9 5C:C
gpt3-babbage 5A8, 5A6G 5CA 5A9: 5A,8 5A-6 .755 .723 5C8 5A,A
gpt3-curie 5A8 5AG8 5C9 5A:6 5A,9 5A-, 5CC 5CG: 5CG- 5A-8
gpt3-davinci .813 .762 .712 .802 .730 .801 5AA 5C8A .698 .744
Note3 "# &% &)!&)##$&#51&$$"# &%)!&)+
#0%#$ ))!1!&$!&&))5/))$)+%#!#+& &*!#$!"&%5$$
)$)&)("%&&pX56+2%3VpX569&!WpY5695
&$C)0#?)0"# &%#"&%0#  #!$+&$#(?00 &&!
)&!&!!1&#50 #!$)0&1$#?)&!&!!1&#+?0$0) &%!)&%
0#!)1&.# 5
Table 6
Performance of Each Model Per Prompt
backpack ball book bottle box brick fork hat knife lightbulb pants paperclip pencil rope shoe shovel sock table tire toothbrush M SD
semdis-mean 568 568 5,, 56 5 5 58 5-6 56 5,: 5-G 5G 5-- 58 5,G 56 56G 5-G 5-C 5C 58 56
ocs-main 5 5-: 5G9 5GC 5,G 5-6 5,C 5G 5-6 756A 5G 5,: 5-6 5, 5A 5-G 5-9 5G: 5C 5,C 5- 5G
st5-base 5C- 59 5G 5G- 5-, 5,, 5G6 56: 5-C 5,- 59, 5-G 5,9 5-- 5C, 59G 5-C 5-G 5-G 5,A 5-C 59
st5-large 5-9 5G- 5-G 5-C 5-- 5,, 5GC 5 5- 5 5G: 5-6 5,- 5-9 59G 5GG 5-A 5,8 5-6 5GG 5-- 5,
st5-3b 5-8 5-8 5,- 5-G 5,8 5,9 5-: 56 5,: 56: 5G 5,8 5,G 5-6 5GA 5,9 5-: 5,A 5-: 5G9 5-6 5
gpt3_emb-ada 5-9 5C6 5G, 5G9 5-: 5,8 5-A 5G, 5-: 5,8 5G, 5G9 59A 5- 5A 5-: 5-8 5-8 5G6 59- 5G, 5
gpt3_emb-babbage 5,8 5GA 5,: 5-- 5,: 5, 5G, 5,G 5,: 5,C 5-: 5-A 5GA 5-6 5C9 5G, 5-C 5-6 5,: 599 5-G 5,
t5-base 599 5A6 5AA 5:, 5C 5C 59- 59G 5A, 59A 5:C 5A9 5AC 5G8 5A6 5A6 59 5:9 5:, 5A 5C: 5,
gpt3-ada 5G6 5A: 5C: 5:6 5C9 5C6 5A8 5CC 5A9 5A9 5:- 5AA 5A- 5G9 5A- 5:6 59- 5:G 5:9 5A9 5A6 5-
gpt3-babbage 5C, 5: 5A6 5:: 5C8 .64 5AC 5C9 5A: 5A, 5:: 5A- .85 5G: 5:A .83 5C8 5C8 5:- 5:C 5A- 5A9
gpt3-curie 5A 5A8 .78 5:C 5A- 5C 5A8 5C6 5AA 5A 586 5A9 5:6 59G 5:C 5: .71 5:A 5:C 5:6 5A- 5C
gpt3-davinci .80 .84 5A .88 .74 .64 .83 .77 .81 .83 .91 .79 5:9 .56 .91 5A8 5C8 .90 .92 .81 .80 .09
Note.7#  &%#$&##"")&&$&1#1&$$ &)#""# &%5/)
)$)+%#!#+ &*!#$!"&%5
Robustness to Size of Training Data
#?!#)&()H&""%I&$.L0&)#!0&$&($&(&( #!$)&
"?7)0#$&)=/#?&$5+,6,6>+ &(0.%&$&&&)*"# 1."?2& $)5
<()0#?)0"# &%#"07-ada&!babbage7)H! #!$)"7!?0
!""###)#"&(!&&509S#"0!&&=:6G&($&$)>+r B5C"#
babbage&!r B599 "#ada&)$$#&$ #1 )#10&)$ #!$)5!!#&$
&(!&&) #&&$.()#$1$#""5<#babbage+"#7""0)#"0#&$
 #1 ))<(#%%?0@)G6S#"0!&&5
Figure 1
Effect of increasing training size
0$&( #!$$&) #""%1$.?0"?&(2& $)50))
&%$&$.&&?01$))&(!&&3C6$&$)+#S#"0"$$)500&$#?
%##"&($&$)+r B5G:?0babbage?0$ada0&)&"# &%#"r B5-+5CG#"0
"# &%#"0(( #!$50&$&1"# &%&#(#?)I%*$.#658?0:6G
$&$)+0)$&1$.)&$?6589658A"# 666S#"0&($&$)5
0$))0&,66&($&$)+0"# &%#"&$&($&(&( #!$))$$
#&$.)#(0&0&)$ #!$)"#  )&!50)()?#I)#)30#?
$$!&&)!!# &%0&)$"# &%="?7)0#>+&!0#?(##!%&&
?0#any&(!&&=H#7)0#>+)#$$.#0)(0#")!)&!(#"$&(&(5
:

&#&%0!0)I)#?0#&."7(5&0+?)!&#7#"707#2
P1&$$&O #!$&!)#?&$!)&!(#"$&(&(5)(7-O)$&() #!$
)H+?%#)%!?## .)5<#"?7)0#((+?%#)%!&2# 
&)*("#&&(#"0#?#(&$&%0))+0 &!9&(2& $)&!6)
2& $)+&!#1!!&()"#0&(2& $)5&$A)0#?)&2& $# 
0)).$50)%&$?&) $$!.6+!#0#?2(&#"%#)3"$$ )
%#&)&)($#*?0&)!% &$ )&0#*)+?0%0 &)0!%1
!%)#)"#&%0)%#&00&#5
Table 7
Example of a Few-Shot Text-to-Text Prompt and Completion
Example Prompt:
Below is a list of uses for a SOCK. On a scale of 10-50, judge how
original each use for a sock is, where 10 is 'not at all creative'
and 50 is 'very creative':
USES
1. to use it like a puppet.
2. You can put googly eyes and make a sock puppet show.
3. You can color it and maybe make a snake.
4. a cool and funny puppet.
5. maybe you could put it on your hands and pretend to have
superpowers.
6. using it as gloves.
7. you could use it for ASMR
8. Cut them and make a 3D sculpture.
9. you can make a dress for your doll
10. to use it like a backpack or store money in it
RATINGS
1. 27
2. 27
3. 32
4. 24
5. 36
6.

Model Completion:
20
7. 40
8. 50
9. 45
10. 35
Note30# )?0&)#1!!#0 #!$+%$!(&()"#0")"1 )05
0%# $#)0#?0 #!$%#)"# 0# +)&(&"P6.O0)2& $5
<#H#7)0#+&) $&# ?&))!+?0#2& $)?0&)#15&0+0
#!$0&!#$.$.#)#?&##"P0#?#(&$&%0)"#XO)5%67
96)&)&$)%&$+?)!76&!!1!!.,+ &(0"?7)0#%# $#)?&
&0&$"7#)5)$)&)0#?&$:5
8

Table 8
Zero-shot and Few-shot performance (GPT-3 DaVinci), Overall and by Dataset
N(examples) ALL betal18 bs12 dod20 hmsl motesf motesp setal08 snb17 snbmo09
05- 59 5A 58 5A 5,9 568 56 5- 5:
55G, 5: 5-: 5G- 5G- 5-: 5-A 5,G 568 5-C
Transferability to Unseen Prompts
&!!##)%#())#))+%&$&($&(&( #!$)$&0"# &#"
0&$&))&)*)$"+&!&$!#17"#7) )L&$8&!!)))0)
I)#+&$.( #!$)&!##)#"I#  )&!1&$&(#
&#0+$.))50)?&)%#%&$H!&)0)#())#"0)"$))#"
)1)!)+%&)0))0)%&)0&01#)$.)&$)0!)1)!
$&(&#&%0=55+0) &%!)&%&#&%0>)0#$!2%$&)%1))&.
&(!&&#(?050) #!$"# !&&#1&$$r B5C-=5CC&# 7$1$>+
?0$&)$))0#?!rB5G5,:= &#"# rB585-,>5
Table 9
Per-Prompt Performance for Held-Out Prompts (Pearson Correlation).
model bottle pants paperclip shovel spoon ALL M SD
semdis-mean 5,G 5,6 5G 59 5, 5G 58 56G
ocs-main 5GC 5,A 5,6 5GG 5,- 5,: 5-, 5
t5-base 5G8 5G- 5GC 5-G 5,8 5GA 5G6 56:
gpt3-ada 5A 5C9 599 5G9 59G 5C6 59: 568
gpt3-babbage .78 5C8 .56 5CA 59, 5C- 5CG .09
gpt3-curie 5AC .69 59G .72 .58 .63 .66 56:
Discussion
Fine-tuned Large Language Models Greatly Outperform Current Automated Scoring
Approaches
0)$)! #)&!0&"7!$&(7$&(&( #!$)#"# 0%
)&7#"707&&#&%0)"# &1.%#(= &)&$5+,6,6>&! )
=/&.EF#0)#+,6,>5
0 &(!#"0 #1 )2&#!&.51&$&!&(&)?0& &.0
$&() $70 &7@!(!!&&)#")#))+0) ) #!$)0#?!
"# &%#"r =5, (mean of prompts = 658)+?0$"# !&r B5,C (mean of
,6

prompts = 65-).0"7!))!0+9&!7-+&(!"# r B5AC #r B
5:50)0$!##$.&%#))!&&))+&%0?01&&#&)&!&%&)+&$)#
&%#))# )5
1#)?#*0&))0#?01&$#")1)!$&("#)%#(=/%H&*
&$5+,6,,>50))!.)#)0#)"!()?0&!"").$#")1)!$&(+"7
(!&$?#*2 #!$)#$&0).$#"0&)*&!# )5<0+
0)!.?&)!!&)&&$2$#&##"?000) 0#!) &.#)"#
)%#(+&!?2%0)&(##!!&$#"#&$ #1 ).#2$#5
<#%#2+0$ &?0%00 &)%#$&!?00 )$1))$*$.# %0
0(00&0%#$&#?0)&!00 &)0))!.5#!&&)#"
 #1!!%&$)#))+?%# &!&!# H!&()#"0)& )#)@!(!.
!""0 &&)&!"#!&&1&(%#$&##"r B5:-5# &(&)($)#)
@!( #0$))#). &#"&$$#0@!( )=55#&(#"%*4&?(0#
&$$0#0)&1&(!>+0%#$&#?&)r B5::50)1&$ (0!&)0
&#2 &%$(&?0%0?%#$!12%& #!$#%#$&?00 &@!( )3
&%$(0&0)0%)&%0?&#&%0(5
0I)# &)3why&))#%&&$0)%#2L0)&(&!&$#"
?#*?0%0%&!###0)I)#+?0&1)# 1!%"##))$
2$&&#)5)$*$.0&0 &@#.#"0")"# 0&&)#%30&)
) $.0&1&)#(&!1.#)!)&!(#"$&(&(50#)))#"0 #!$)
?0"?&($&$))#)0)1?5<0+&)?0) &% #!$)+0 )0&
P#(&$O!&)&0())#(0#")#0&)0#$!&).#%&5J+)# 
#"0 #1 )&&$)#$*$."# #00!!&)0&0"7("!)5<#
2& $+10#(0?!7!$%&!)#))"#02 )+&)0#$!
%# $2#(0#!)&!?0&) $&!&)2))!?0!""?#!)+
%&#+#)$$(5<#2& $+?0$,C6)#))#"P&?(0O#P&?(0O
? #1!!!$%&#"# 0))"#%*=&$,>+P?(0#0#$!#@%)
$&%O&!P)&)&?(0O#0 &!0!&&50%$#)%#%&$) $&.?
0))#))?&)&&$*.!)##!.0)+?0%0#&$$.&$$#?!0 #
##)0"# &%$&##1#))1)!&#&%0)5
#0#))$0!!&0!&0+)0&0) &.0&1$&!0
0&1#)#"(1&(#)5# &) &. #&1)#0$#?)#0(0)&(+
)#  &.!)0&() #?0$#0) &.)#($.&!0#0 #!5<#0
)&*#"!)%))#+<(,)0#?)0!)##" #!$!%#)+&$#(?00)*?
&!#$ #(##1 #1D#)0#?(##!))#""5)0#?)0#?0"7( #!$)
0? %0 #%$#)$.#00 &!)#0&0&)$)&! !!(
&#&%0)5#0"!()(()1#"&!&(#&))0$&1$.##7# 
"# &%#"brick+?0%00&!0(&))& $)H=,95AS#"0"$$!&&)>+&$)#
#%%!0 #))7!&&))&%0?0!""0 &@!()5&()#"%*0&!0
0(0)#.+& &)#"!%&$.5J+0)&)#)#(+&!$##*(.#!
brick+&# O)#.!#)#%#$&?0O)"# &%#0&# 5<0
?#*%#$!""#  &%(&1&&%#!)&( +%0&$$(()1)!
$&( #!$)?0 ##"0!%&$.#"0 &)#%#&( #(&$H&$
#!$)5
Figure 2
Plots of Model Prediction Distribution Density Compared to Ground Truth Human Judgments
Note: #!0@!( ))0#?!#!$5$!).) &#)!#%# &
!)#?0!"")%&$)5*?))&!#$ #(##17 #1)&)%D#!5#!
0)*?)6595
Large Language Models Can Be Robust with Only a Small Number of Training Examples
"#")1)!) &%7!)&%7&)!&#&%0))%0&)0#))!.
&! ))0&0.!##!&(50.?#*"# $.P#7#"707#2O+1
?00.O11)&# #&)#)"#5J+0&$.# &0 &
@!( ))#!&)%+&)?)&?7-)&))0"# &%#") &%!)&%
#!$)?0#$.&) &$$ #"$&$)&)$#?&)S#"#"$$&(!&&5#)1!
0)&("#:# )) $&#)$.D&$!)(?0#$.&0&!"$#"
# )+ &.0&1"?$&$)&%))&.5
<0+?"#!0&0&$!)&!(#"$&(&())%# 1
?0#&."7(5<#&27#72 #!$+%&"(&&#&# "#2
%# $#+"!#&)prompt engineering =&$5+,6,>+ &.)""%5"#!&
)&)%&$$.)("%&$&1$.$#?%#$&#=r B5-+p X566>?0&H#7)0## 
0&) $.&)*!& #!$#&0#(&$.#"&%0)#)?0##1!(&.
2& $)#"&(##!&(=&$A>5#1!("12& $)#")%#)"#0&# K)$$
&)*(&#7"! #!$K&)!0&"# &%#r B5G,5)&))1## 
((?&*)+&!0&$*$.&!@) )?0%0 #10"# &%#"0)
&#&%0501&$#"0)&#&%0)0&!"&$ #!$)& #&!$.&%%))$&!$#?
0&#)0%&1.)&%0%# .5
0))$$&"#&!!( #$&$)+10#(00"# &% #1 )
#(0.&%0?$&$(#$1$#""5)1()#+?$$1&$&$"# #!1(
0*()&%0)#)0&0)#)7$1$%#!! )+&)!#.0&0#)#"0
!&&)))!!0=/&.&$+,6:D/&.&!$1&+,6,D &)&$5+,6,6D#"$%0
#0&$5+,6CD$1&&$5+,66:D$1&&$+,6AD$1&&$5+,68>5<0+?&(0&
0%# .?#$!""# &%# #%0 &*!&&)+# &$H!&!
!#$%0%*!"#%#))%.&!I&$.+&!?0)&!&!H!)#))$)"#&(+
%#))71&$!&#+&!)(5 $&%0 &*)&)!!""%# )=5(5
"#"# &#1&$+##0)E& &,669D 1&$++&!"#
1&#)&&$$&(&(#%))(&)*)+&(&$5+,6:D&(&$5+,68DQ"#
!(&$ )%#$#(.+#?,66:>+&!?#$!&$$#?")1)!$&(?#*#
%# &&$&%#))$%&#)5
There are Limits to the Semantic Distance Theoretical Model
%#)!(0"#") &%!)&% #!$)+0&)0"!(#"(&)
 #?&)?0&?#)1!&$.() &%!)&% !!( #!$)&)!#)5
0 !!()!! #1#0&)$)+#$. &(&$$.3&&1&(
"# &%#"65,,+%# &!#658"#0&)$)5#?1+?00!"" #!$)
%# &!+&))(!#%%!5
 #()+0 #) #&"&%#"#"# &%)0)H#"0 #!$50$
0)H=#"" &&$5+,6,,>&!0I&$.=&""$&$5+,6,6>#"&(!&&&&$)#
 #&"&%#)+?0)(0)& &(!&&&!&%0%+?&$.&$?&.)#)1
"# &% #1 )?0$&( #!$)=&$&&$5+,6,6>50)&!!#0#$!"#
 !!()0)&3st5-base#"# !st5-3b+(7-O)ada#"# !
babbage5
0))$))(()1#"&$ #0&)) #!$.(0)#"
) &%!)&% &)&#2."# !1(0*(5)& #!$O)!)&!(#"$&(&(
 #1!+0$&#)0#")) &%!)&%##(&$.(&#?&*50&)#"#
0))%$&+&"1)(&##"?00) &$$&!$&( #!$)!""%#$!0$
!)&!(0)$ )5
Transparency and Explainability of Machine Learning Models
%0&$$(#"&$$&# &!)%#(+) &%!)&% #!$)&!)&$*+)0
)*#" #(1&#)&))"# 0&(2)+#"# 00 &@!())!&)&
(#!05&(0!$.( #!$)+0(#&$)(&$$.#"$%0$&(&()
0#(&(2&)&$)%&$$.&)#))$5#?1+&.#&!%#)#"2)?$$
%#&&$&.#"1&)1%$&$&))5# )%0&))&)$.&$!0($)0
$&(&(+)%0&)(!&)!)%))(#)(1&#)#"))#)+.&
!&$.0& "$"%#!"!(#("#?&!5$!(&# &! #!$)+&%$&$.#)
?0%0 &.#!&.)!0(07)&*)!%&#&$)()=55+!".(("!)!)
)%0##$)>+?)0#$!0#"#)# 0(3 #I&$&!$))&)!0&&(&$
%#))7)%##"0 &7?$&(&(5
0&)# 0#%&$")#"&# &!).) )&%*(&)+0#(00.
!&#&!&%#"# )&%0)5$*#7!!))?0%0!#)%#!.
1&#)&!@!()+0.%&#&%#))$.&%#))&$$%#$$%!!&&))+&!%&
)%!"#&)) #!%$.500& #!$)&)!&!#?0&!()#0
#+&!&$#$)0!5&%%0#?1+0#?0"$!?#$!)%"#&))&!
?0&??#$!!#&#0 %#$!&$#()%%0&$$(5
)%("#&))&)?$$&)!1$#(P2$&&$O=5(5#&&%0+)%0##$
).%0#$#()+#&>&"%&$$$(%).) )&&%1&&)#")!.=5(5/&!#
&&$5+,6,6D(&$5+,68>5&$$.+0!%!%# $2.#") &%
&#&%0)0&)&"0))%3%#)) $&.)&2$&&$#%))&0
0&&&($#"%# $2!%)#)K&P$&%*#2OK&!0 #!$)0 )$1)&&)#
)%"#!)&$%#$&#)50!0#"))#0!  )#"
2$&&$.K0. &.?#* %0 #%$#)$.#0#??0#0.?#$!+0#&%.
#"0!)&!( &.&$$#?"#!)!&))#)%&#%=!,68>+#"##0
)&*70#$!)=5(5+&)#")#!)>#!1$#!))0&"%&$$.$$(
@!( )5# ?$&($&(&( #!$)7 #&$0(07)&*))+##"
%#%"#)%!&))=/(%%+,6,,>5#?1+) &.%# $2#(0#
2$&0&)#(0 )$1)5%?#*"!)0&?0)!?0&&)#(&)*+
&)*()#P0*0#(0).)O##$.)$)&!)%##"0#?0.
&1!&0%#%$)#+0%#%$)#)0 )$1)!# #&%%&=#@ &
&$5+,6,,>5%#$!0&+" &) &$%&#)?0&%# &#&$
).%0# %&#&%0) $#.!0&%# )&"%&$$$(%&!).%0#$#(%&$
)(+)(0&0 #!$%&2$&)#?#%)) (0)!&)& #&
1&$!.%0%*5
0&&#""&))+0)1)!$&(&#&%0)1)(&!0)
#())#1) &% #!$)5<)+0.)0#?&(&$.%&)!&$.#  %0 &
&)K)%"%&$$.+&%# #)#" $$&)+?0%0)#")0%0&$$()#"1&$!.&!
#))$&))0&%# ?0!1!&$(&!)##7!!))50) &)0 #!$)
%&)($.$##* #$*#)&$&1#&# &#=55+ $$0 &@!()&(
&%0)#)>5%#!$.+)1)!$&(%&%##$&&! #15?0 &
@!( )#%#%#)%&)!# #1 #!$)+?0&))0& &.."#!
) &% #!$)!##0&1&).%#%1)5
2$&&$.&!"&))?$$& #&&#"0!)%))# #1("#?&!+
&!)?#0%#)!(0&%1)#"#0%# )&!&( &%0$&(
#!$)%&1.)&%05<#2& $+0#&#O)##)!%#$)
##%#$)"#)( &%0$&(0(07)&*)!%)# &*(+%$!(!%&#&$
%#2)50)%$!))&!&!H!&)))) #")*)&!0 &#1)(0##))$."#
0 &1#=&$E/#())+,6,>5
Release of Materials
0)"# ( #!$)"# 0))!.&&1&$&$&&"?7&)!##$+&
&1.%#(C5
02 )"# !"#0))!.+!$%&!)#))? #1!+)#0&0
#!$%&#P%0&O.)(02&%)& )#)1&$&#&))&?&(5<#
"&$%&#)+?&%*#?$!(01&$#"0 #!$0&1(0)*#?$!(&!&!
&ALL )$?0#!$%& #1&$0&)&$)#&1&$&$"#"#$50)%#!#)!
&(&##+89S+#"0"$$!&&)5
#!&!)$)"#2 )&&1&$&$&A50)%$!)0!&&)#"
)#))+#%#&()%0#$&)0#(#?#*?00)& %# #)#"#!&&))+
6 https://openscoring.du.edu
7 https://github.com/massivetexts/llm_aut_study
&)?$$&)&$.(## &$H&#)&!)$)#0!&&52 )&!&&$.))&
&!&))%"%###*)?0%0%&&?#?)5
Further Work
0)$))!0&&&$1)(&##0"# &%+#)))+
&!&)"&$.#"&(&(&(#!$)"#&# &!1&$&##"!1(0*(
))50)$))&$)00&%%&$.#"0&#&%0+&$)##0!###&(##!!&$
#"?I.50 & &.#&$ #1 )#)!.+""%)0&!"0
1)(&#+&!?%#)!&#)#!.0)#!.#"&#&%0)5
#?&! #1!"# &%+&(&&!0#" #!$&%0%)&!)H) &.
%# &!+&)?$$&)%#$$%##" #&(!&&5!(#!&&))"#)#%)#"#+
#?&!%$&!&&+ &.&$)#""# &%50&)&$)#)0#?0&&)"# 7
&)! #!$)""# !# &7&!&)*7&!&1&(=&(&&$5+,6,6>5
0&)+##&%0(0 0#?#%# $&(1%$&))"%&#&)*+)1&$&$#&!@)
0(&$7.7!)($&(&( #!$#)%&$H ##0$&(&(#"0!# &5<#
2& $+#)!.)%#("#$ &.7&(!%0$!+& #!$0&0&)7&!#
%0$!O)&!%0$!7"&%(2=!# &7&!&1>&!$&$!))#))=&)*7&!&1>
?#$!0.#0)H!#(&$$."# 5
0))!.#)?I)#)&#0$ )#") &% #!$)&)?$$&)0
%0&&%)%)#"0#?)?#*#0+&)!)%))!&$5<?#* &.&$)#
2&!#%#)!0&%&$.#")"#&!!#&$!1(0*()%#()))
)%#(+)%0&)#)))#%0&(+)%#(#")&%)&!%#)I%)&)*)#@)0
+?#*(?0##)$$(+#!".()2&$#1#$)#))5#0((
I)#)?00)%&#""& &)#"%#"!%0!%#3(10&&?
%# $#)0&1&$*$0##!&))#%&!+%&$#?$*$0##!!%#)!%&1#"
)#))?0%0!0 &1#+& #&)#?&!)?#0.&$%&#)
0(07)&*))()5<0+00&))!.#!)&!0$ )#") &%!)&%
0#!)5<#2& $+/&.&!F#0)#=,6,>"#!0& &$.)& &)#"#1$.+
 #1!"# )"$))5))0#$!$*?))%!"#0I*)&!$ &#)3!#
0. #1#0#)%0&$$()+#%&0. #!"!#!#)#L<&$$.+0%)!.)
$.)&!?00($)0$&(&(+?#$!)(#! ?00
0)&  0#!)?#*)"#)%#(0##0&)*)#0$&(&()##5
Conclusion
0))!.+?)!&?&#&%0#&# &!)%#(#"0&$&))
&)*+&)#"!1(0*(5&#&%0&$!"!$&($&(&( #!$)&!
%# &!0 #0%)&7#"707&) &%!)&% #!$)=/&.EF#0)#+,6,D
 &)&$5+,6,6>5
#1&$$"# &%+#"7!&#&%0(&$. #1!#12)(
&)$)+?001&#))1)!$&(&#&%0))0#?!&&1&("# &%#"
r B5A:-?00 &@!()1))r B5::&%#))0&)$) &%!)&%).) )57
&)!) &%!)&%&#&%0)!!#)0#?0)& (&)&)#"! #!$)5
##*(&#)))#0& ##"&(!&&+#&#&%0 #1!?0 #!&&
&$&!.)0#?!(&)#10&)$)?0&)$$&)S#"0&(!&&5&$)#)0#?!
)# &$%&#)?0&"172& $# #&$.untrained?$$#"# 
) &% #!$)+?0%0?$11&$.$$)&0# )#")1)!$&(0)
&&5<&$$.+%#)!(&)"&$.+0&#&%0)$$)0#?!)("%&(&)#
 )0&0&!1)"#5
&$)#%# $!&!!!$%&!&$&(!&&)%# )!#"&))!)+?0
&%0)#)@!(!.&$&)00 &@!()+?0%0?&))!#&&!1&$&
# #!$)50&)#!0&!1!&$0 &&)& "%0#??&.)
=/&.EF#0)#+,6,D &)&$+,6,,>+?#)1!0&0#%# #" $$&)
%&)&$#(0#!%&$5)#"!()&)#($.%#&(("#)1)!
$&( 0#!+$&(%# #)!&&))?$$%&)($. #&0)$#")&%05
0 &.%###"0)&))(&?&1"#&# &!
!1(0*()%#(5/.)0#?(&1.)#(&$.#&$(?0 $7&0 &
@!( )+#)$)0$ #1&# &!!1(0*()%#(#?&!&$%&#)
%&1.)&%0+?0&# &!)#))%#()& #$&$#2)"# $$
&!@!()5&!!##&1#!(0%0&$$()&))#%&!?00 &&)+ &.
&$!$&%)?0&@!()#&%&$+)%0&)1#)?0&$7 "!&%*
(1#)!)50)$)&$)#7%&$&!&&!I.#0$ )#"&# &!
)%#(+#!%(&!&"# 2)( 0#!)?0%0 ))#?&&$.))&!)!.
?0!1(0*()&%05&$$+0)%?#*%#)#0#(#(""#
?0%&1.)&%0#!1$# 0#!)#)%&$#(&$0*( &) &
1&$!+$%&$+&!&""#!&$?&.5
Acknowledgements
0)?#*?&)"!!.0)#"!%&#%%)=>+(&[-69,66885
References
%&+5=,6,,>5#)0&)*)% &%0"$%.%#"#!!1(
0*(L1)(&#?07<(&$5Creativity Research Journal5!1&%#$
$%&#50)344!#5#(4656:646G66G85,6,,5,6GGC9C
%&+5+/0& +5+&@H$+5+ &)+5+<$ )+55+E(&)%&*+5
=,6,>5$.(&# &!#(&$.)%#(#01&$"# #"#&%))#"&1
0*(5Gifted Child Quarterly5!1&%#$$%&#5
0)344!#5#(465AA466C8:C,,6C:AG
/&!#&+5+\&H7#!\(H+5+$+F5+/#+5+&*+5+/&&!#+
5+&%&+5+$7#H+5+#$&+5+/@& )+5+0&$&+5+E&+<5=,6,6>5
2$&&$"%&$$$(%=Q>3#%)+&2## )+##)&!%0&$$()
#?&!)#)$5Information Fusion+58+:,95
0)344!#5#(4656C4@5"")5,685,56,
/&.+55+EF#0)#+55=,6,>5# &(%&1.&)))) ?0 )3
#$&"# "#%# () &%!)&%5Behavior Research Methods+53=,>+A9A
A:650)344!#5#(465-A9:4)-G,:76,676G9-7?
/&.+55++J55+0))+55+#)(+55+/!*+5+0+
M5+<*+5+M+F5+?&$+55+&+5F5+E$1&+5F5=,6:>5#)!%##"
!1!&$%&1&$."# &"%#&$%#%1.5Proceedings of the National
Academy of Sciences+115=9>+6:A68,50)344!#5#(4656A-4&)5A-9-,9
/&.+55+E$1&+5F5=,6,>50.!#!&)( #%&1&%#)) L
2%1&##"0)&$#!""%!1(0*(&)*)5Psychology of
Aesthetics, Creativity, and the Arts+6=G>+-68-850)344!#5#(4656-A4&66,8A
/!*+5+]0$ &+5+F&*+5+E&+55=,6->5)))) #"
!1(0*(. &)#"0)@%1#7)%#( 0#!3""%)#"0 #"#7
!&)&! 7#7&)*#$&$.&!1&$!.5Psychology of Aesthetics, Creativity, and the
Arts+7=G>+-G-G850)344!#5#(4656-A4&66--CGG
/(%%5=,6,>5BigScience Language Open-science Open-access Multilingual
(BLOOM) Language Model50)3440((("&%5%#4()%%4$## 5
/$+55+(+5J5+EF#!&+55=,66->5&%0$$$#%&#5Journal of
Machine Learning Research+3+88-6,,50)344!$5&% 5#(4!#465999948GG8858GG8-A
/#@&#?)*+5+&1+5+F#$+5+E*#$#1+5=,6A>5%0(?#!1%#)?0
)?#!"# &#5Transactions of the Association for Computational Linguistics+5+-9
GC50)344!#5#(464("?8%)
/#?+5+&+/5+.!+5+&0+5+&$&+F55+0&?&$+5+$&*&&+
5+0.& +5+&).+5+E)*$$+5=,6,6>5&(&( #!$)&"?7)0#$&)5
Advances in Neural Information Processing Systems+33+:AA865
0)344!#5#(465G:9964&Q15,6695GC9
/%H&*+5+&(+5+<#0 &+/5+E#$+5=,6,,>50 &%0)&*#13
%# &)##"1&#))1)!$&(&#&%0)"#&# &!)%#(#"!1(
0*(&)*)5The Journal of Creative Behavior5!1&%#$$%&#5
0)344!#5#(46566,4@#%5998
& #!+/5+&0?)7#(&+F5+/&!&$#)+5+EN#+5=,669>5##0G67
.&"#$$#?7#"0#&%))#"&10*(3$1&!?$$0?
$$ 5Gifted Child Quarterly+49=G>+,:-,85
0)344!#5#(465AA466C8:C,696G866G6,
?)+5+ &)+55+<&)+55+&!&+55+E&)0 &+5=886>5
!2(.$&) &%&&$.))5Journal of the American Society for Information Science+
41=C>+-8G6A50)344!#5#(464!G"9
1$+F5+0&(+575++5+E#&#1&+5=,6:>5BERT: Pre-training of deep
bidirectional transformers for language understanding5&Q15
0)344&215#(4&)4:656G:691,
#?+F55=,66:>50 )%"# &#1&$1&$&#2%0&(=,669,66A>3
?!#?# )%"# &#1&$)&%05Acoustical Science and Technology, 29=G>+
,GA,9950)344!#5#(465,964&)5,85,GA
 &)+5+%&+5+/0& +5+(&)%&*+5+.+5+&@H$+5+<$ )+5
5+? &+5+E&&+5=,6,,>5What makes children’s responses to creativity
assessments difficult to judge reliably?&)%) !"#$%&#'51).#"
#(&5
 &)+5+E&+55=,6G>5!)&!("$%.&!#(&$.3$&
1&&$)%15Thinking Skills and Creativity+14+9CCA50)344!#5#(464"C?A8
 &)+5+(&)%&*+5+E#0.+55=,6,6>5&)(!1(0*(
#(&$.?00 &&)&!27 ( #!$)3).%0# %%# &)##" 0#!)5
Psychology of Aesthetics, Creativity, and the Arts+15=G>+CG9CC-50)344!#5#(464(0%)II
<#0 &+/5+E#$+5=,6,,>5Fifty years later and still working: Rediscovering
Paulus et al.’s (1970) automated scoring of divergent thinking tests57'5
0)344!#5#(465-,-G4#)"5#4.@:%
$"#!+F55=896>5&1.5American Psychologist+5=8>+GGGG9G5
0)344!#5#(4656-A4066C-G:A
(+5+"*+5+0#+F5+$$+5+ "+5+EJ&(+57N5=,68>5QK
2$&&$&"%&$$$(%5Science Robotics+4=-A>+&&.A,65
0)344!#5#(465,C4)%##%)5&&.A,6
&(&+5+&&)#1^+5+?&.& !&+5+#+5+/$&(.+5+#?.+5+E
 0+55=,6,6>5Don’t stop pretraining: Adapt language models to domains and tasks.
&Q10344&215#(4&)4,66G568CG
&))+55+1&+5+E$1&+5F5=,6:>50!!&$.&!"&)$.#"
$&.)#&()#"!1(0*(5Frontiers in Psychology+95
0)344???5"#)5#(4&%$)465--:84").(5,6:56-G-
#"$%0#0+5+$$+5+E!)&.+5=,6C>50*()!0#23)&$
!)(#"0)#)#2&""%)%&1!1(0*(&#$)1.5Social Science
Computer Review+34=->+-GA-9850)344!#5#(465AA46:8GG-8-99::A-C
#"" &+F5+/#(&!+5+)%0+5+/%0&)*&.&+5+&+5+0"#!+5+&)&)+
5!5+!%*)+55+$$+F5+$&*+5+(&+5+#$&!+5+$$%&+5+
))%0+51&!+& #%+/5+.+5+)!#+5+ #.&+5+$)+5+_"+5
=,6,,>5Training compute-optimal large language models. &Q15
0)344!#5#(465G:9964&Q15,,6-5999C
#" &+5=888>5#&$)%$&) &%!2(5Proceedings of the 22nd
Annual International ACM SIGIR Conference on Research and Development in Information
Retrieval+969A50)344!#5#(465G94-,C,G5-,CG8
&$&+F5+%&!$)0+5+(0&+5+/#?+5/5+0))+/5+0$!+5+&.+5+
&!"#!+5++F5+E #!+5=,6,6>5Scaling laws for neural language models.&Q15
0)344!#5#(465G:9964&Q15,6656:-C
&##+5+E&&.&&+5=,6,,>5Leakage and the reproducibility crisis in ML-based
science5&Q150)344!#5#(465G:9964&Q15,,6A56A6G:
 +55=,66C>5&?)%&1.))L1?#"0#&%))#"
&10*(=>5Creativity Research Journal+18=>+-5
0)344!#5#(465,6A4)9-,C8-G%@:6Z,
 +55=,66:>5&7&&$.))#"0$&#)0#"%&1&%01 ##0M
&!!1(0*())%#)5The Journal of Creative Behavior+42=,>+6C-65
0)344!#5#(46566,4@5,C,7C69A5,66:56,8652
#@ &+5++55+!+5+&)#+J5+E?&)&?&+J5=,6,,>5Large language
models are zero-shot reasoners.&Q150)344!#5#(465G:9964&Q15,,6958C
&!&+55+E &)+55=88A>5)#$##$&#O)#$ 30$&
) &%&&$.))0#.#"&%I)#+!%#+&!)&##"*#?$!(5
Psychological Review+104=,>+,50)344!#5#(464!%?-9
+55+E(+55=888>5&(0&)#"#@%).#7(&1 &2
"&%#H&#5Nature+401=CA99>+A::A85
+5+J&+5+<+F5+F&(+N5+&.&)0+5+E(+5=,6,>5Pre-train, prompt,
and predict: A systematic survey of prompting methods in natural language processing.&Q15
0)344!#5#(465G:9964&Q15,6A5-9:C
+J5++5+#.&$+5++F5+F#)0+5+0+5+1.+5+?)+5+
N$ #.+5+E#.&#1+5=,68>5RoBERTa: A robustly optimized BERT pretraining
approach5&Q150344&215#(4&)486A5C8,
*#$#1+5+0+5+#&!#+5+E&+F5=,6->5Efficient estimation of word
representations in vector space5&Q150344&215#(4&)4-65-A:
*#$#1+5+)*1+5+0+5+#&!#+55+E&+F5=,6->5)!
)&#)#"?#!)&!0&))&!0%# #)#&$.55F55/()+5/##+
5$$(+N50&0& &+E5M5(=!)5>+Advances in neural information
processing systems 26=5--8>5&))#%&)+%5
0344&)5)5%%4&496,7!)!7)&#)7#"7?#!)7&!70&))7&!707
%# #)#&$.5!"
$&*&&+5+Q+5++5+&!"#!+5+&+F55+?#*+F5+J&+M5+H&*+
5+ +F55+E&$$&%.+5=,6,,>5Text and code embeddings by contrastive pre-training5
&Q150)344!#5#(465G:9964&Q15,,656669+F5+`(#+55+#)&+5+&+F5+
&$$+5/5++5+EJ&(+J5=,6,>5Sentence-T5: Scalable sentence encoders from pre-
trained text-to-text models.&Q150)344!#5#(465G:9964&Q15,6:56::AA
(&)%&*+5+E &)+5=,6,6>5Open creativity scoring # )#"?&'5
1).#"150)344#)%#(5!5!
&*+5+&""+5+#(+F5+0+5+E&!&+55=,6>5English Gigaword Fifth
Edition&&)'5()%&&#)# 50)344!#5#(465-94G<7M:6
&$)+55=8A6>5Computer Simulation of Human Ratings of Creativity. Final
Report. =#58776-,>50)344"$)5%5!5(#14"$$246C6C9:5!"
&$)+55+EH$+F55=8C:>5%#(%&1.)).%# 5Gifted Child
Quarterly+12=,>+A8:-50)344!#5#(465AAS,<66C8:C,C:6,66,6,
(#+F5+#%0+5+E&(+5=,6G>5$#3$#&$1%#)"#?#!
)&#5Proceedings of the 2014 Conference on Empirical Methods in Natural Language
Processing (EMNLP)+9-,9G-50)344!#5#(464(")0?(
$%*+F55+.+55+E+5=,6,,>51(0*(3&$.1?)555
%#E5%&=!)5>+Handbook of creativity assessment5!?&!$(&5
&!"#!+5+&&) 0&+5+&$ &)+5+E)*1+5=,6:>5Improving language
understanding by generative pre-training550)344#&5%# 4$#(4$&(&(7
)1)!4
&""$+5+0&H+5+#)+5++5+&&(+5+&&+5+N0#+J5++5+E
+5F5=,6,6>52$#(0$ )#"&)"$&(?0&"!27#72&)"# 5
Journal of Machine Learning Research+21=G6>+CA50)344@ $5#(4&)41,4,676AG50 $
&@*&+5+N0&(+F5+#.1+5+E&(+5=,6C>5SQuAD: 100,000+ questions
for machine comprehension of text5&Q150)344!#5#(465G:9964&Q15C6C569,96
 )+5+E1.%0+5=,68>5Sentence-BERT: Sentence embeddings using
siamese BERT-Networks5&Q150)344&215#(4&)486:566:G1
# $+5+/@&+55+E#!#+55=,6>50#%#"$&)$&$&1)3
1&$&##"%# #))%&)&$&)#(5AAAI Spring Symposium: Logical
Formalizations of Commonsense Reasoning+86895
!+5=,68>5#2$&($&%*#2 &%0$&( #!$)"#0(0)&*)
!%)#)&!)&$ #!$))&!5Nature Machine Intelligence+1=9>+,6C,95
0)344!#5#(4656-:4)G,,9C768766G:72
%#+55=88>5Divergent thinking5$2$)0(##&##?##!+F5
%#+55=,66:>5&1.&!!%&#5New Horizons in Education+56=>5
0)344%5!5(#14L!BF:-,86
%#+55+!$$&+55+&*+55+$7F&) +<55+E$)?&!+55=,6C>5
0%0)#"!1(0*())LCreativity. Theories – Research - Applications+3=>+G
:50)344!#5#(465994%&7,6C7666
%#+55+E%&+5=,6,>51(0*(&)&!%&##"%&1#&$5
Creativity Research Journal+24=>+CCA950)344!#5#(4656:646G66G85,6,5C9,8,8
%#+55+$$&+5+%&+5+E& #!+/5=,66>5#&%))#"&1
0*(&)!%#)#")#&$&!$%&%01 3"".7.&"#$$#?75Creativity
Research Journal+22=G>+-C-C:50)344!#5#(4656:646G66G85,6659,--8-
%#+55+E&H+5=88,>5%#(!1(0*()))(#&$!&#&$
#&!&%&1.!25Educational and Psychological Measurement+52=>+,-,,5
0)344!#5#(465AA466-CGG8,69,66,C
&!7?&$.+5+&.$#+55+& &!&+5+E/&#+/5=,6,,>51(0*(
&!%&1&%01 K#?)#()0$*L!&! &7&&$.))5Psychology of
Aesthetics, Creativity, and the Arts5!1&%#$$%&#5
0)344!#5#(4656-A4&%&666696A
0&?+5=,6,>5?#*)_%&? &*&)L%# &)##"0)@%1
)%#(!2)0&)))) #"!1(0*(5Thinking Skills and Creativity+40+
66A:850)344!#5#(4656C4@5)%5,6,566A:8
$1&+5F5=,6>5@%1)%#(#"!1(0*(32& (0$&$.#"
)&$))+)&%)+&!%#)I%)&)*)5Thinking Skills and Creativity+6=>+,G-65
0)344!#5#(4656C4@5)%5,6656C566
$1&+5F5+)& +55+E/&.+55=,6A>5$!#?L1&$&(0#$!4?
)%#( 0#!"#!1(0*(&)*)5The Journal of Creative Behavior+51=->+,C,,G5
0)344!#5#(46566,4@#%56
$1&+5F5+)& +55+/(+5+&+5+EO##+5=,668>5))#
2%+$&)%.+&!%&1.32$#($#?7#!+0(07#!+&!&%1""%)5
Journal of Research in Personality+43=C>+6:A68650)344!#5#(4656C4@5@5,66856G569
$1&+5F5+)+/55+$$)+F55+/&#&+55+& +F55+))+55+
&H+F55+E%0&!+55=,66:>5))))(%&1.?0!1(0*(&)*)3
2$#(0$&$.&!1&$!.#"?)@%1)%#( 0#!)5Psychology of
Aesthetics, Creativity, and the Arts+2=,>+C::950)344!#5#(4656-A48-7-:8C5,5,5C:
.!+55+& #!+F55+#0 &+55+E&H7/#%##+F5=,68>5
&1. &) !(&!&)!)"# 8:G,6-3).) &%1?5
Psychology of Aesthetics, Creativity, and the Arts+13=,>+--G-5
0)344!#5#(4656-A4&%&6666,,:
#%0+5+$.(+5++F5+0&(+F5+&(+55+(+5+E#)+5=,6->5
%)1! #!$)"#) &%%# #)#&$.#1&) &*5Proceedings of
the 2013 Conference on Empirical Methods in Natural Language Processing+C-CG,5
0)344&%$&0#$#(.5#(4-7A6
#&%+55=8CC>5Torrance test of creative thinking: Norms-technical manual
research edition-verbal Tests, forms A and B-figural tests, forms A and B5%#3)#$
))5
#&%+55=8A,>5!%11&$!.#"0#&%))#"&10*(5
The Journal of Creative Behavior+6=G>+,-C,9,50)344!#5#(46566,4@5,C,7
C69A58A,5668-C52
#&%+55=8:6>5#?(%&1$.("!3,,7.$#(!&$)!.5
Creative Child & Adult Quarterly+5=->+G:9:+A65
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł.,
& Polosukhin, I. (2017). Attention is all you need. arXiv. http://arxiv.org/abs/1706.03762
&$+5+E/#())+<5N5=,6,>5 .)".(0&""%&$$$(%
%K&$.)(0(##!+0&!+&!0%$&$ )#"0##)!&#&%05
Computer Law Review International+22=G>+8A,50)344!#5#(4658A:94%7,6,7,,6G6,
##0)+55+E& &+55=!)5>5=,669>5TREC: Experiment and evaluation in
information retrieval5))50)344 ))5 5!48A:6,C,,,6A-C4%4
&$$&%0+55+E#(&+5=8C9>5Modes of thinking in young children5?J#*5
&(+5+*)&%0&*+J5+&(&+5+(0+5+%0&$+F5+$$+<5+1.+5+E
/#? &+5=,68>53)%*%0 &*"#(&$7#)$&(&(
!)&!().) )5Advances in Neural Information Processing Systems+325
0)344&)5)5%%4&4,6840&)04GG8C",G&"A"&C"6GC"G8,-!&:!C7)&%50 $
&(+5+(0+5+%0&$+F5+$$+<5+1.+5+E/#? &+5=,6:>53
$7&)*%0 &*&!&&$.))$&"# "#&&$$&(&(!)&!(5Proceedings of
the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for
NLP+-9--9950)344!#5#(465:C9-414:79GGC
J&(+N5+&+N5+J&(+J5+&#$$+F5+&$&*0!#1+55+E+M55=,68>5
Q$3&$H!&#())1&("#$&(&(!)&!(5Advances in Neural
Information Processing Systems+325
N&%%&#+5F5+#$$.+5+%0%*+55+&H&+55+J#(+55+$%$$+5
5+$&+55+#)+F55+E/&0#$# ?+55=,69>50"$%#"0(0#!
%#(1%&&%)#$&!#(&H&#&$%#&%&!#30 !&(#$#"
!1$# &$2%)5The Leadership Quarterly+26=->+-G,-9:5
0)344!#5#(4656C4@5$&I&5,6956-566A