Need help with CRC hash for movie thumbs with foreign-language characters
#1
Hi all,

I need a little help calculating the CRC hash used to find thumbs / fanart.

Specifically with foreign-language characters in filenames.

I use the movieview to get the info, and the C# Hash function given as a sample in the wiki.

It works well for English filenames.

Example of a bad result:
smb://NAS/Films2/2ème Sous-Sol (P2) [2007]/2ème Sous-Sol.bluray.(2007).mkv

I tried with UTF-8 and ISO encodings but can't get the correct CRC (75556f66).

I get 5b292925 with data from the database or API, or 2dcaa2f7 with "smb://NAS/Films2/2ème Sous-Sol (P2) [2007]/2ème Sous-Sol.bluray.(2007).mkv"

Other files in the same directory, like:
smb://NAS/Films2/.45 (.45) [2006]/.45.(2006).mkv

work perfectly.

Thanks for your help

Tolriq.
#2
The path needs to be UTF-8, and we lower-case it prior to conversion. Note that the lower-casing is done in ASCII (i.e. it'll do incorrect lower-casing for non-ASCII characters).
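To illustrate what ASCII-only lower-casing means in C# terms, here is a minimal sketch. The class and method names (`AsciiLowerDemo`, `AsciiLower`) are made up for illustration and are not taken from XBMC; it only assumes that A-Z are lower-cased and every other character is left untouched.

```csharp
using System;
using System.Text;

public static class AsciiLowerDemo
{
    // Hypothetical helper: lower-cases only A-Z and leaves every other
    // character (including accented upper-case letters) unchanged.
    public static string AsciiLower(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            sb.Append(c >= 'A' && c <= 'Z' ? (char)(c + 32) : c);
        }
        return sb.ToString();
    }
}
```

With this helper, `AsciiLower("2ÈME Sous-Sol")` keeps the `È` and gives `2Ème sous-sol`, whereas `"2ÈME Sous-Sol".ToLowerInvariant()` also lower-cases it and gives `2ème sous-sol`. The two strings encode to different UTF-8 bytes, so they hash differently.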

Cheers,
Jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


#3
Thanks for your quick reply :)

I could have searched for a long time :)

For information, the correct code to calculate the hash in C# is:

Code:
using System;
using System.Text;

public static string Hash(string input)
        {
            var chars = input.ToCharArray();
            var newchars = new char[chars.Length];
            for (var index = 0; index < chars.Length; index++)
            {
                newchars[index] = Char.ToLower(chars[index]);
            }
            input = new string(newchars);
            var mCrc = 0xffffffff;
            var bytes = Encoding.UTF8.GetBytes(input);
            foreach (var myByte in bytes)
            {
                mCrc ^= ((uint)(myByte) << 24);
                for (var i = 0; i < 8; i++)
                {
                    if ((Convert.ToUInt32(mCrc) & 0x80000000) == 0x80000000)
                    {
                        mCrc = (mCrc << 1) ^ 0x04C11DB7;
                    }
                    else
                    {
                        mCrc <<= 1;
                    }
                }
            }
            return String.Format("{0:x8}", mCrc);
        }
Perhaps this should go in the wiki. I don't know if it works exactly the same on Mono.

Tolriq.
#4
Well, it turns out it doesn't work for upper-case characters after all :)

Like Ï :(

If some C# expert can help :)
#5
OK, so after a beer and good food my brain lit up :)

I decoded the "incorrect ascii lower casing" :)

The real working code follows:

Code:
using System;
using System.Text;

public static string Hash(string input)
        {
            var chars = input.ToCharArray();
            for (var index = 0; index < chars.Length; index++)
            {
                if (chars[index] <= 127)
                    chars[index] = Char.ToLowerInvariant(chars[index]);
            }
            input = new string(chars);
            var mCrc = 0xffffffff;
            var bytes = Encoding.UTF8.GetBytes(input);
            foreach (var myByte in bytes)
            {
                mCrc ^= ((uint)(myByte) << 24);
                for (var i = 0; i < 8; i++)
                {
                    if ((Convert.ToUInt32(mCrc) & 0x80000000) == 0x80000000)
                    {
                        mCrc = (mCrc << 1) ^ 0x04C11DB7;
                    }
                    else
                    {
                        mCrc <<= 1;
                    }
                }
            }
            return String.Format("{0:x8}", mCrc);
        }
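As a quick sanity check for a routine like the one above: its parameters (polynomial 0x04C11DB7, initial value 0xffffffff, bits processed most-significant first, no final XOR) appear to match the variant usually catalogued as CRC-32/MPEG-2, whose published check value for the ASCII string "123456789" is 0376e6e7. A minimal self-contained sketch follows; the class name `XbmcCrcCheck` is made up for illustration.

```csharp
using System;
using System.Text;

public static class XbmcCrcCheck
{
    // Same steps as the routine above: ASCII-only lower-casing, UTF-8 encoding,
    // then an MSB-first CRC with polynomial 0x04C11DB7 and no final XOR.
    public static string Hash(string input)
    {
        char[] chars = input.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            if (chars[i] <= 127)
                chars[i] = Char.ToLowerInvariant(chars[i]);
        }

        uint crc = 0xffffffff;
        foreach (byte b in Encoding.UTF8.GetBytes(new string(chars)))
        {
            crc ^= (uint)b << 24;
            for (int bit = 0; bit < 8; bit++)
            {
                crc = (crc & 0x80000000) != 0 ? (crc << 1) ^ 0x04C11DB7 : crc << 1;
            }
        }
        return crc.ToString("x8");
    }
}
```

`XbmcCrcCheck.Hash("123456789")` should return `0376e6e7`. `Hash("ABC")` and `Hash("abc")` should be equal, while `Hash("Ï")` and `Hash("ï")` should differ, because characters above 127 are deliberately left untouched.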
#6
Nice work.
#7
Nicely done; this fixes something we would have run into for sure with UMM. I'm sure it will save me a bunch of time later when we start interfacing with XBMC images and changing them out.

For Mono, it'll work as is. If your target framework is 2.0 (which is recommended, as it's the most mature for Mono), you'll want to use explicit types instead of var in the declarations.

Here's what we use in UMM, modified with your changes, geared for Mono, and based on the XBMC wiki:

Code:
using System;
using System.Text;

public string Hash(string input)
            {
                char[] chars = input.ToCharArray();
                
                for (int index = 0; index < chars.Length; index++)
                {
                    if (chars[index] <= 127)
                    {
                        chars[index] = Char.ToLowerInvariant(chars[index]);
                    }
                }
                input = new string(chars);
                byte[] bytes = Encoding.UTF8.GetBytes(input);
                uint m_crc = 0xffffffff;
                foreach (byte myByte in bytes)
                {
                    m_crc ^= ((uint)(myByte) << 24);
                    for (int i = 0; i < 8; i++)
                    {
                        if ((System.Convert.ToUInt32(m_crc) & 0x80000000) == 0x80000000)
                        {
                            m_crc = (m_crc << 1) ^ 0x04C11DB7;
                        }
                        else
                        {
                            m_crc <<= 1;
                        }
                    }
                }
                return String.Format("{0:x8}", m_crc);
            }


The following code was originally converted from XBMC's source to C# (it will work under Moonlight and Silverlight). I've also updated it with your fix:
Code:
using System;

public class Crc32  // derived from XBMC source code, converted to C#
            {
            public Crc32()
                {
                    Reset();
                }
    
            Crc32(Char[] buffer, int count)
                {
                    Reset();
                    Compute(buffer, count);
                }

            Crc32(string strValue)
                {
                    Reset();
                    Compute(strValue.ToCharArray(), strValue.Length);
                }
                            
            public UInt32 getCRC_decString(string strValue)
                {
                    ComputeFromLowerCase(strValue);
                    UInt32 result = m_crc;
                    Reset();
                    return result;
                    //hex test
                    //string hexValueLZA = String.Format("0{0:X}", m_crc);
                    //hex back to string test
                    //int decAgain = int.Parse(hexValueLZA, System.Globalization.NumberStyles.HexNumber);
                }

            public string getCRC_hexString(string strValue)
                {
                    ComputeFromLowerCase(strValue);
                    string hexValue = m_crc.ToString("X");
                    //convert to HEX String format for return
                    string result = String.Format("{0:x8}", m_crc);
                    Reset();
                    return result;
                    //return String.Format("0{0:X}", m_crc);
                }

            void Reset()
                {
                m_crc = 0xFFFFFFFF;
                }
        
            private UInt32 m_crc;
            
                void Compute(Char[] buffer, int count)
                {
                    //loop through char array and pass each char to Compute()
                    int curCount = 0;
                    while (curCount < count)
                    {
                        Compute(buffer[curCount]);
                        curCount += 1;
                    }
                }
        
                void Compute(Char value)
                {
                  m_crc ^= (uint)(value << 24);
                  for (int i = 0; i < 8; i++)
                  {
                    if ((m_crc & 0x80000000) != 0)
                    {
                      m_crc = (m_crc << 1) ^ 0x04C11DB7;
                    }
                    else
                    {
                      m_crc <<= 1;
                    }
                  }
                }
        
                void Compute(string strValue)
                {
                  Compute(strValue.ToCharArray(), strValue.Length);
                }
                
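                void ComputeFromLowerCase(string strValue)
                {
                    char[] chars = strValue.ToCharArray();

                    // lower-case ASCII characters only; anything above 127 is left as-is
                    for (int index = 0; index < chars.Length; index++)
                    {
                        if (chars[index] <= 127)
                        {
                            chars[index] = Char.ToLowerInvariant(chars[index]);
                        }
                    }
                    string strLower = new string(chars);
                    // feed the UTF-8 bytes (not the UTF-16 chars) to the CRC, so that
                    // non-ASCII characters give the same result as Hash() above
                    foreach (byte b in System.Text.Encoding.UTF8.GetBytes(strLower))
                    {
                        Compute((char)b);
                    }
                }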
           }
#8
Glad I helped :)

I started learning C# 3 weeks ago :p I come from PHP, and my last Windows application was 10 years ago in C++ / MFC :)

Thanks for the Mono info. I think I'll update the wiki with your code, since it uses more standard declarations. At the moment my project is WPF, so .NET 3.5 :)

Too many things to learn, my head is going to explode :)
#9
I won't open a new thread for this :)

Just some information for others, and perhaps a question for the devs.

For stacks, the hash for the thumb must be calculated from the first file of the stack (which seems logical when reading the docs), but the hash for the fanart is calculated from the full stack string (i.e. stack://smb://NAS/Films1/XIII - Part 1 (2008)/XIII.CD1.mkv , smb://NAS/Films1/XIII - Part 1 (2008)/XIII.CD2.mkv), which is less logical I think.

It should work like the thumb, or the standard way: directory + filename.
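For anyone implementing the stack behaviour described above, here is a rough sketch of how the string to hash could be chosen. The names (`StackPaths`, `ThumbHashInput`, `FanartHashInput`) are made up for illustration, and it assumes the stacked files are separated by " , " exactly as in the example above. It does not try to handle file names that themselves contain commas.

```csharp
using System;

public static class StackPaths
{
    private const string StackPrefix = "stack://";

    // Hypothetical helper: string to hash for the thumb.
    // For a stack this is the first file of the stack; otherwise the path itself.
    public static string ThumbHashInput(string path)
    {
        if (!path.StartsWith(StackPrefix, StringComparison.Ordinal))
            return path;

        string body = path.Substring(StackPrefix.Length);
        // Assumption: files in a stack are separated by " , ".
        int sep = body.IndexOf(" , ", StringComparison.Ordinal);
        return sep < 0 ? body : body.Substring(0, sep);
    }

    // Hypothetical helper: string to hash for the fanart.
    // Per the observation above, the full stack string is hashed unchanged.
    public static string FanartHashInput(string path)
    {
        return path;
    }
}
```

For the XIII example, `ThumbHashInput` gives `smb://NAS/Films1/XIII - Part 1 (2008)/XIII.CD1.mkv`, while `FanartHashInput` gives back the whole `stack://...` string. Either result is then passed to the Hash function.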