encoding - Converting ISO-8859-1 to UTF-8 for MultipartFormData in Play2 + Scala when parsing email from SendGrid


I have hooked up my Play2 + Scala application to the SendGrid Parse API, and I'm struggling with decoding and encoding the content of the email.

Since emails can be in different encodings, SendGrid provides a JSON object, charsets:

{"to":"utf-8","cc":"utf-8","subject":"utf-8","from":"utf-8","text":"iso-8859-1","html":"iso-8859-1"} 

In my test case the "text" is "med vänliga hälsningar jakobs webshop". If I extract it from the multipart request and print it out:

Logger.info(request.body.dataParts.get("text").get.head)

I get:

med v?nliga h?lsningar jakobs webshop 

OK, so given the info from SendGrid, let's convert the string to UTF-8.

import java.nio.ByteBuffer
import java.nio.charset.Charset

def parseMail = Action(parse.multipartFormData) {
  request => {
    // Wrap the bytes of the already-parsed "text" part
    val inputBuffer = request.body.dataParts.get("text").map {
      v => ByteBuffer.wrap(v.head.getBytes())
    }

    val fromCharset = Charset.forName("ISO-8859-1")
    val toCharset = Charset.forName("UTF-8")

    // Decode as ISO-8859-1, then re-encode as UTF-8
    val data = fromCharset.decode(inputBuffer.get)
    Logger.info("" + data)

    val outputBuffer = toCharset.encode(data)
    val text = new String(outputBuffer.array())

    // Save stuff to the MongoDB instance
  }
}

This results in:

med v�nliga h�lsningar jakobs webshop 

So strange. This should work. I wonder what happens in the body parser parse.multipartFormData and its data part handler:

def handleDataPart: PartHandler[Part] = {
  case headers @ PartInfoMatcher(partName) if !FileInfoMatcher.unapply(headers).isDefined =>
    Traversable.takeUpTo[Array[Byte]](DEFAULT_MAX_TEXT_LENGTH)
      .transform(Iteratee.consume[Array[Byte]]().map(bytes => DataPart(partName, new String(bytes, "utf-8")))(play.core.Execution.internalContext))
      .flatMap { data =>
        Cont({
          case Input.El(_) => Done(MaxDataPartSizeExceeded(partName), Input.Empty)
          case in => Done(data, in)
        })
      }(play.core.Execution.internalContext)
}

When consuming the data, a new String is created with the encoding UTF-8:

.transform(Iteratee.consume[Array[Byte]]().map(bytes => DataPart(partName, new String(bytes, "utf-8")))(play.core.Execution.internalContext))

Does this mean that my ISO-8859-1 encoded text is decoded as UTF-8 when it is parsed? If so, how should I create my parser so that it decodes and encodes my params according to the provided JSON object charsets? I'm obviously doing something wrong, but I can't figure it out!

You'll need to copy the implementation of the parse.multipartFormData function, changing the decoding from UTF-8 to ISO-8859-1, and use that in your action.
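
Once a copied parser keeps the raw bytes of each part instead of calling new String(bytes, "utf-8"), the decoding step itself is small. The following is only a sketch of that step, not Play's API: PartDecoder, decodeParts, and the rawParts map are hypothetical names, and it assumes the charsets JSON object has already been read into a Map[String, String].

```scala
import java.nio.charset.{Charset, StandardCharsets}
import scala.util.Try

object PartDecoder {
  // Hypothetical helper: decode each raw part with the charset SendGrid
  // reported for it. Falls back to UTF-8 when the part is not listed in
  // `charsets` or the charset name is not supported by the JVM.
  def decodeParts(rawParts: Map[String, Array[Byte]],
                  charsets: Map[String, String]): Map[String, String] =
    rawParts.map { case (name, bytes) =>
      val charset = charsets.get(name)
        .flatMap(cs => Try(Charset.forName(cs)).toOption)
        .getOrElse(StandardCharsets.UTF_8)
      name -> new String(bytes, charset)
    }
}
```

Because the charsets part arrives in the same multipart request as the parts it describes, the raw bytes probably have to be collected first and decoded only after the whole body has been read.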

The problem is that Play decodes data parts as UTF-8 by default, and there is no way to change that other than implementing your own parser.
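
To see why converting the string afterwards cannot help, here is a small standalone sketch (plain JVM calls, no Play involved; the object and value names are made up for illustration). It assumes the part arrives as ISO-8859-1 bytes, as the charsets object says. In ISO-8859-1 the letter ä is the single byte 0xE4, which is not valid UTF-8 on its own, so a UTF-8 decoder replaces it with U+FFFD and the original byte is gone.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object LossyDecodeDemo {
  // The text as it would travel on the wire for an ISO-8859-1 part.
  val original: String = "med v\u00e4nliga h\u00e4lsningar"
  val wireBytes: Array[Byte] = original.getBytes(ISO_8859_1)

  // What a UTF-8-only parser does: the lone 0xE4 byte is malformed UTF-8,
  // so the decoder substitutes the replacement character U+FFFD.
  val misdecoded: String = new String(wireBytes, UTF_8)

  // Trying to repair the string afterwards: the 0xE4 byte no longer exists,
  // so re-encoding and decoding again does not bring the letter back.
  val repaired: String = new String(misdecoded.getBytes(UTF_8), ISO_8859_1)

  // Decoding the raw bytes with the right charset works.
  val correct: String = new String(wireBytes, ISO_8859_1)
}
```

Only correct equals the original text. That is why the decoding has to happen inside the parser, where the raw bytes are still available.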

