How to enable DataEncodation.Mode.Kanji?

Oct 31, 2012 at 5:05 AM

Is it possible?

Coordinator
Oct 31, 2012 at 8:15 AM
Edited Oct 31, 2012 at 8:16 AM

It's already enabled. The mode is determined by your input string. If your string is not fully kanji, it will most likely use UTF-8.

 

Also which language are you trying to encode?

 

I'm asking because kanji means Japanese kanji. It somewhat feels like you are asking about Chinese.

Oct 31, 2012 at 2:21 PM
Edited Oct 31, 2012 at 4:52 PM

Oops, my bad. I meant: can I use the SHIFT_JIS character encoding? Because recognition is not working properly, I guess. If you try to encode & decode the Russian word "привет", you will get "привет", but if you try another word, "пока", you get just "п". Other QR code encoders like http://zxing.appspot.com/generator/ work fine with SHIFT_JIS.
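
For reference, here is roughly what I'm doing (a minimal sketch; I'm assuming the usual QrEncoder / Encode usage from the QrCode.Net samples, and I scan the rendered image with a phone reader):

using Gma.QrCodeNet.Encoding;

class Repro
{
    static void Main()
    {
        // Sketch only: encode a Cyrillic word, render qrCode.Matrix with the
        // sample renderer, then scan the image with a phone decoder.
        // "привет" scans back correctly, "пока" comes back as just "п".
        QrEncoder encoder = new QrEncoder(ErrorCorrectionLevel.M);
        QrCode qrCode = encoder.Encode("пока");
    }
}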

Coordinator
Oct 31, 2012 at 5:44 PM

It is using kanji. I will take a look later and see what's going on.

Coordinator
Nov 1, 2012 at 9:33 AM
Edited Nov 1, 2012 at 9:37 AM

I think that has a lot to do with the decoder. The code produced by QrCode.Net can be properly decoded by ZXing, whereas both ZXing's and our encoder's versions cannot be decoded by the ingma mobile decoder.

Currently QrCode.Net is based on ZXing, but is more restrictive with Shift_JIS. Our encoder follows the ISO specification 100%, whereas ZXing does not for Shift_JIS.

Under the ISO specification, only the first 225 Shift_JIS characters should be used for eight-bit byte mode. What the ISO text refers to is the original Shift_JIS table, not the full version, whereas ZXing uses the full version. The Shift_JIS table has been expanded several times.
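
To illustrate the difference (a rough sketch using .NET's standard shift_jis codec, not our own table code): characters that Shift_JIS encodes as a single byte are the "original" table allowed for eight-bit byte mode, while anything that needs two bytes falls into the kanji range.

using System;
using System.Text;

class ShiftJisByteWidth
{
    static void Main()
    {
        // On .NET Core/5+ you must first call
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        // on .NET Framework shift_jis is available out of the box.
        Encoding sjis = Encoding.GetEncoding("shift_jis");
        foreach (char c in "Aｱ漢")   // Latin letter, half-width katakana, kanji
        {
            int width = sjis.GetByteCount(c.ToString());
            Console.WriteLine("'{0}' -> {1} byte(s)", c, width);   // 1, 1, 2
        }
    }
}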

Another reason our output can be decoded by ZXing is that we are based on their encoder. It was originally re-crafted on top of ZXing, so even though our rules are stricter, they still stay within their range. I have never seen any other decoder's internal code.

One downside of our encoder is that we automatically decide which encoding to use. That's something I want to address later with an advanced encode option, since 99% of people who use an encoder want it decided automatically for them. (Well, that's my assumption: most people never read the ISO spec, and they never read the char tables either, so there is no way for them to decide which table is best for which situation.)
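
Just to illustrate what "auto decide" means, here is a simplified sketch of my own (not our actual detection code): try the densest mode first and fall back to eight-bit byte with an ECI char table.

using System;
using System.Linq;
using System.Text;

enum QrMode { Numeric, Alphanumeric, Kanji, EightBitByte }

static class ModeGuess
{
    const string AlphanumericSet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

    // Simplified auto-detection; the real recognition code also handles
    // mixed segments and picks a specific ECI char table for byte mode.
    public static QrMode Guess(string content)
    {
        if (content.All(char.IsDigit))
            return QrMode.Numeric;
        if (content.All(c => AlphanumericSet.IndexOf(c) >= 0))
            return QrMode.Alphanumeric;

        Encoding sjis = Encoding.GetEncoding("shift_jis");
        if (content.All(c => sjis.GetByteCount(c.ToString()) == 2))
            return QrMode.Kanji;          // every char is a double-byte Shift_JIS char

        return QrMode.EightBitByte;       // falls back to some ECI char table
    }

    static void Main()
    {
        Console.WriteLine(Guess("12345"));     // Numeric
        Console.WriteLine(Guess("привет"));    // Kanji: Shift_JIS also covers Cyrillic,
                                               // which is why Russian text lands in kanji mode
    }
}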

 

One way to resolve it, if you know your target will most likely use Shift_JIS or UTF-8: I can tell you which code to remove, so the encoder will only check a specific table without using kanji. I bet a lot of decoders don't care about the kanji table, as it's not widely used even for Japanese. So once you remove the detection code, you can compile your own special version without much trouble.

 

Let me know if you want to do that. 

 

Edit: On what I mean by ZXing using the full table: the first 225 Shift_JIS characters are what we call the original Shift_JIS table, while the rest of the table is what the QR code ISO specification defines as the kanji table. The reason I said even Japanese users won't use kanji mode much is that their text will often fall within the first 225 characters, or be mixed with kanji. Pure kanji is rare, from my understanding.

Nov 2, 2012 at 8:00 AM
Edited Nov 2, 2012 at 8:01 AM
silverlancer wrote: 

Let me know if you want to do that. 

 Yes. I want to do that. :)

Coordinator
Nov 4, 2012 at 5:51 AM

The recognition class is under Gma.QrCodeNet.Encoder.Dataencodation.InputRecognise.cs.

Inside, there is a method called Recognise(string content).

It basically goes through the whole set of encodings and finds the proper one.

Simply change the following code:

int tryEncodePos = ModeEncodeCheck.TryEncodeKanji(content, contentLength);

to

int tryEncodePos = 0;

It will skip kanji detection. 

For eight-bit byte mode, you can check the char tables under Gma.QrCodeNet.Encoder.Dataencodation.ECIset.cs.

Remove any table that you don't want included in detection. That will do the trick.

 

UTF-8 should be the best option for most decoders, but it results in a larger QR code size; it has its pros and cons. Decoders currently on the market are rather messy. Hardly anyone wants to follow the ISO specification, and localized ISO specifications create some chaos around QR code encoding and decoding. That's really bad for special European characters.

If you want most decoders to be able to read your code, UTF-8 is best. If code size is really important, like printing on a business card, then a specific char table should be used, at the risk that some decoders cannot decode it.
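
To make the size trade-off concrete, here is a quick sketch using the standard .NET encodings (the final QR version also depends on the error correction level):

using System;
using System.Text;

class PayloadSize
{
    static void Main()
    {
        // On .NET Core/5+, register CodePagesEncodingProvider before GetEncoding("ISO-8859-5").
        string word = "привет";

        int utf8Bytes = Encoding.UTF8.GetByteCount(word);                        // 12 bytes, 2 per char
        int isoBytes  = Encoding.GetEncoding("ISO-8859-5").GetByteCount(word);   //  6 bytes, 1 per char

        Console.WriteLine("UTF-8:      {0} bytes", utf8Bytes);
        Console.WriteLine("ISO-8859-5: {0} bytes", isoBytes);
        // Fewer data bytes can mean a smaller QR version, but only if the
        // decoder understands the ECI designator for ISO-8859-5.
    }
}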

 

Good luck with your project; let me know if anything is unclear.

 

Nov 17, 2012 at 9:38 PM

No bro, unfortunately int tryEncodePos = 0; does not work :(

Coordinator
Nov 17, 2012 at 11:16 PM

Have you removed int tryEncodePos = ModeEncodeCheck.TryEncodeKanji(content, contentLength); ?

Can you pass the code you have changed to me? I will take a look for you.

Nov 17, 2012 at 11:59 PM
Edited Nov 18, 2012 at 12:02 AM

Yes, of course I removed it.

//int tryEncodePos = ModeEncodeCheck.TryEncodeKanji(content, contentLength);

int tryEncodePos = 0;

Just enter the word "пока" in the textbox & try to decode it. You will see a different word.

Coordinator
Nov 18, 2012 at 4:17 AM
Edited Nov 18, 2012 at 4:20 AM

Works fine on my decoder. 

I have looked inside and found that it starts using the ISO-8859-5 char table. If a decoder doesn't know that ECI char table, it won't be able to figure out how to decode the symbol.
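
A quick check shows the encoded bytes themselves are fine (a sketch with the standard .NET codec, not our encoder): "пока" survives a round trip through ISO-8859-5, so the failure is on the side of decoders that ignore the ECI designator.

using System;
using System.Text;

class EciRoundTrip
{
    static void Main()
    {
        // Requires Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) on .NET Core/5+.
        Encoding iso5 = Encoding.GetEncoding("ISO-8859-5");

        string original = "пока";
        byte[] payload = iso5.GetBytes(original);     // one byte per Cyrillic char
        string roundTrip = iso5.GetString(payload);

        Console.WriteLine(roundTrip == original
            ? "Round trip OK: the data is valid, the decoder just needs ECI support."
            : "Round trip failed.");
    }
}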

I have recently discovered so many cheap-decoder issues. I guess my next goal will be to offer a choice, so developers can decide whether to run in a safe mode or not, where safe mode sticks to the most commonly and widely used character tables.

One way to resolve it is to keep that "int tryEncodePos = 0;" there and look into ECISet.cs. That file is under dataEncodation.

Inside the initialize method, disable all char tables other than iso-8859-1, shift_jis and utf-8, then try encoding again.
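
I can't quote the real initialize method here, so purely as a hypothetical sketch of the whitelist idea (RegisterIfAllowed and the dictionary are placeholders I made up; only the three table names come from this post):

using System.Collections.Generic;

static class EciWhitelistSketch
{
    static readonly HashSet<string> Allowed =
        new HashSet<string> { "iso-8859-1", "shift_jis", "utf-8" };

    // Idea: inside ECISet's initialize method, skip every char table that is
    // not on the whitelist, but leave the ECI values themselves untouched.
    static void RegisterIfAllowed(string tableName, int eciValue,
                                  IDictionary<string, int> tables)
    {
        if (Allowed.Contains(tableName))
            tables[tableName] = eciValue;
    }
}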

I will most likely spend some time during Christmas and get those things sorted.

Hope that solves your problem; let me know otherwise.

 

Edit: Comment out the AppendECI method and don't change any values inside. Entries like shift_jis = 17 are the ECI ISO set; the value follows the ISO table, it is not an index. If you are interested in those tables, I have two links inside the source file.