unicode - Determine all ISO 15924 script codes in a JavaScript string


I'm looking for an efficient way to take a JavaScript string and return all of the scripts that occur in the string.

Full UTF-16, including "astral" plane / non-BMP characters that require surrogate pairs, must be correctly handled. This is possibly the main problem, since JavaScript strings are not really UTF-16 aware: they expose 16-bit code units rather than code points.
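As a minimal sketch of the surrogate-pair handling described above, assuming an engine with ES2015 string iteration and `codePointAt` (older Node.js releases lack both), the following walks a string by code point rather than by 16-bit code unit. The helper name `codePointsOf` is made up for illustration.

```javascript
// Collect the code points of a string, treating each surrogate pair as one unit.
// Assumption: ES2015 string iteration and String.prototype.codePointAt are available.
function codePointsOf(str) {
  const out = [];
  for (const ch of str) {          // the string iterator yields whole code points
    out.push(ch.codePointAt(0));
  }
  return out;
}

// U+1D11E MUSICAL SYMBOL G CLEF is stored as the surrogate pair D834 DD1E.
const clef = "\uD834\uDD1E";
console.log(clef.length);                                     // 2 (UTF-16 code units)
console.log(codePointsOf(clef).map(cp => cp.toString(16)));   // [ '1d11e' ]
```

Indexing with `str[i]` or `charCodeAt(i)` would instead return the two surrogate halves separately, which is the failure mode to avoid here.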

It only has to deal with code points; no fancy awareness of complex scripts or grapheme clusters is necessary. (This is obvious to some of you anyway.)

Example:

stringToIso15924("παν語");

would return something like:

[ "Grek", "Hani" ]

I'm using Node.js and Unicode libraries such as XRegExp and unorm, but I don't mind adding other libraries that might handle or ease such a feature.

I'm not aware of a JavaScript library that can look up character properties such as script codes, which is the second part of the problem.
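For what it's worth, on engines that support ES2018 Unicode property escapes the Script property is reachable from plain regular expressions, without any library. The sketch below assumes such an engine; `CANDIDATE_SCRIPTS` and `scriptsViaPropertyEscapes` are hypothetical names, and the candidate list is deliberately short and would need extending to every script of interest.

```javascript
// Assumption: the engine supports ES2018 Unicode property escapes (\p{...} with the u flag).
// CANDIDATE_SCRIPTS is a hypothetical, deliberately short list of ISO 15924 codes.
const CANDIDATE_SCRIPTS = ["Latn", "Grek", "Cyrl", "Hani", "Hira", "Kana"];

// Compile one regex per script once, rather than on every call.
const SCRIPT_TESTERS = CANDIDATE_SCRIPTS.map(code => ({
  code,
  re: new RegExp(`\\p{Script=${code}}`, "u"),
}));

// Return the candidate script codes that occur anywhere in str,
// in the order of CANDIDATE_SCRIPTS.
function scriptsViaPropertyEscapes(str) {
  return SCRIPT_TESTERS
    .filter(({ re }) => re.test(str))
    .map(({ code }) => code);
}

console.log(scriptsViaPropertyEscapes("παν語")); // [ 'Grek', 'Hani' ]
```

Because the `u` flag makes the regex engine match by code point, surrogate pairs are handled without extra work. The trade-off is one scan of the string per candidate script, so this suits short candidate lists better than exhaustive detection.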

The third part of the problem is to avoid inefficiencies.

I answered a similar question, or at least a related one. In this pastebin there is a (looooong) function that returns the script name for a character. It should be easy to modify it to accommodate a string.
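The pastebin function itself is not reproduced here, so the following is only a rough sketch of how a per-character lookup could be wrapped to handle a whole string. `scriptOfCodePoint` is a hypothetical stand-in for that function, and `SCRIPT_RANGES` is a tiny, incomplete table included purely for illustration; a real table would be generated from the Unicode `Scripts.txt` data file.

```javascript
// Illustrative, incomplete range table: [first code point, last code point, ISO 15924 code].
// It must stay sorted by first code point, and the ranges must not overlap.
const SCRIPT_RANGES = [
  [0x0041, 0x005A, "Latn"],   // A-Z
  [0x0061, 0x007A, "Latn"],   // a-z
  [0x03B1, 0x03C9, "Grek"],   // Greek small letters alpha to omega
  [0x4E00, 0x9FFF, "Hani"],   // CJK Unified Ideographs block
  [0x20000, 0x2A6DF, "Hani"], // CJK Unified Ideographs Extension B block (astral)
];

// Hypothetical stand-in for the per-character function:
// binary search for the range containing codePoint, or null if the table has no entry.
function scriptOfCodePoint(codePoint) {
  let lo = 0;
  let hi = SCRIPT_RANGES.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [first, last, code] = SCRIPT_RANGES[mid];
    if (codePoint < first) hi = mid - 1;
    else if (codePoint > last) lo = mid + 1;
    else return code;
  }
  return null;
}

// One pass over the string by code point; a Set removes duplicates
// and keeps the scripts in first-seen order.
function stringToIso15924(str) {
  const found = new Set();
  for (const ch of str) {
    const code = scriptOfCodePoint(ch.codePointAt(0));
    if (code !== null) found.add(code);
  }
  return [...found];
}

console.log(stringToIso15924("παν語")); // [ 'Grek', 'Hani' ]
```

With a sorted table, each character costs a logarithmic number of comparisons and the string is scanned only once, which addresses the efficiency concern in the question.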

