Accepting numbers and international characters in names

Fai_Fong Member
edited January 2019 in CCCApply
We have discovered completed applications with only numbers in the last name, and names containing Chinese characters. Both throw our system off. Is there a reason why we allow numbers in names? And are there any suggestions as to what to do with Chinese characters?

Comments

  • edited March 2018
    To convert the Chinese characters, you'll need to set stripDiacritics to "true". The stripDiacritics attribute will convert certain diacritic (non-English or foreign) characters to standard characters. Please see the example below.

    image

    As for numbers in the name, I'll need to check on that and get back to you.
  • Fai_Fong Member
    edited March 2018
    We do have stripDiacritics set to "true", but it's not working. Do we also need encoding="UTF-8"?

    <?xml version="1.0" ?>
    <formatDefinitions xmlns="http://xmlns.cccnext.org/xfer">
     <formatDefinition outputFormat="delimited" id="smcformat" applicationType="apply" delimiter="|"
     stripDiacritics="true">
    <fieldList>

    As for the second part: we are also getting names made up entirely of numbers. Is that expected?

  • edited March 2018
    It's unusual, but there are people with numbers in their names. Therefore, we neither check for nor prevent numbers in names.
  • edited March 2018
    Yes, you should be using encoding="UTF-8".

    Would you mind sending me the CCCID and confirmation number of that app? Please email it to [email protected]. I'd like to take a look at it, as it is possible that it is a fraudulent app.
  • severa Member
    edited March 2018

    I'd be interested to know what 'it's not working' means.

    FYI, there is a bug in the strip diacritics behavior, specifically with some Chinese characters. It only occurs with fixed-length files, which yours seems to be. What happens in that case is that the length of the field is not what is defined in the format file. It's not a huge issue if you don't encounter those particular characters, though.

    Also, correct me if I'm wrong, but strip diacritics should work whether or not the output format is set to UTF-8. Correct?

  • Fai_Fong Member
    edited March 2018
    Kasey, I added the encoding and reset a couple of applications containing Chinese characters. The resulting name I see is just "?". Let me know if that is expected. Thanks,
  • edited March 2018
    It looks like this is a known issue. Validation for Chinese (non-standard, non-English) characters is being researched and has been prioritized for the next annual update. I do not have an ETA, but for now, know that it is being looked into.
  • JoeHo Member
    edited October 2018
    Kasey, when viewing the _main or _inst file in Notepad++ and changing the encoding (from UTF-8 to ANSI or ASCII), we notice that some special characters create a misalignment in the row, with extra spaces. It looks correct under UTF-8; however, our SIS does not process UTF-8 and so misreads the input. Is there a fix for this to strip out the special characters?
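
One plausible explanation for the misalignment described above, offered as an assumption rather than a confirmed diagnosis of the CCCApply download files: in UTF-8, accented and Chinese characters take more than one byte each, so a field padded to a fixed number of characters contains more bytes than a single-byte reader expects. The sketch below uses a hypothetical 10-character name field, and `latin-1` stands in for the "ANSI" view because it maps every byte to exactly one character. The field width and the helper names are illustrative only.

```python
# Hypothetical fixed-width name field; the width of 10 is an assumption.
FIELD_WIDTH = 10


def pad_field(value: str, width: int = FIELD_WIDTH) -> str:
    """Pad or truncate a value to a fixed number of characters."""
    return value[:width].ljust(width)


def byte_length(value: str, encoding: str = "utf-8") -> int:
    """Number of bytes the padded field occupies in the given encoding."""
    return len(pad_field(value).encode(encoding))


def width_as_single_byte(value: str) -> int:
    """Apparent width when the UTF-8 bytes are read one byte per character."""
    return len(pad_field(value).encode("utf-8").decode("latin-1"))


# "Mu\u00f1oz" has an n with tilde; "\u674e\u660e" is a two-character Chinese name.
for name in ["Smith", "Mu\u00f1oz", "\u674e\u660e"]:
    print(
        repr(pad_field(name)),
        "chars:", len(pad_field(name)),
        "utf-8 bytes:", byte_length(name),
        "single-byte width:", width_as_single_byte(name),
    )
```

Every field is 10 characters wide, but only the plain ASCII name is also 10 bytes wide. A reader that counts bytes would see the later columns of the other two rows shifted to the right.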
  • John_Saric Member
    edited December 2018
    This is our code
    <formatDefinitions xmlns="http://xmlns.cccnext.org/xfer">
    <formatDefinition id="gcccdMain" outputFormat="fixed">
    <fieldList>

    To adjust for foreign characters with accents, would we add the following?

    <formatDefinitions xmlns="http://xmlns.cccnext.org/xfer">
    <formatDefinition id="gcccdMain" outputFormat="fixed" stripDiacritics="True">
    <fieldList>

    Would this solve the problem?





  • JoeHo Member
    edited December 2018
    John, I believe the issue that we are having is that our SIS does not process UTF-8 properly. We are thinking about putting something in place to convert the UTF-8 file to ASCII after it is downloaded from CCCApply, but unfortunately that won't handle anything that's outside of the ASCII character set.
  • John_Saric Member
    edited December 2018
    So, would this include Spanish characters with accent marks, or foreign characters with two dots above an 'a'?
  • JoeHo Member
    edited December 2018
    It will be limited to ASCII and extended ASCII, which includes what you are referring to (see http://www.asciitable.com/). The difficult part will be programming the actual conversion.
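
As a rough illustration of what such a post-download conversion could look like, here is a minimal Python sketch. It is not how CCCApply's stripDiacritics attribute is implemented. The function name `to_ascii` and the "?" placeholder are assumptions made for this example. It decomposes accented letters with the standard `unicodedata` module, drops the combining marks, and substitutes a placeholder for anything that still falls outside ASCII, such as Chinese characters.

```python
import unicodedata


def to_ascii(text: str, placeholder: str = "?") -> str:
    """Best-effort ASCII conversion: strip accents, replace everything else."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters that are still non-ASCII (for example Chinese characters)
    # have no ASCII base letter, so substitute one placeholder per character.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in stripped)


print(to_ascii("Mu\u00f1oz"))    # n with tilde       -> Munoz
print(to_ascii("Zo\u00eb"))      # e with two dots    -> Zoe
print(to_ascii("\u674e\u660e"))  # Chinese characters -> ??
```

Accents, tildes, and two-dot marks reduce to their base letters, while characters with no ASCII base letter become the placeholder. Some compatibility characters, such as ligatures, expand into more than one letter under NFKD, so a fixed-width file may need its fields padded again after the conversion.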

  • John_Saric Member
    edited December 2018
    So, would this solve the problem with accents, tildes, and two dots above a letter?


    <formatDefinitions xmlns="http://xmlns.cccnext.org/xfer">
    <formatDefinition id="gcccdMain" outputFormat="fixed" stripDiacritics="True">
    <fieldList>


