Crowdin counts words according to the following principles:
- A word is a combination of letters, punctuation marks, and/or special characters (e.g., @ # $ % ^ & * – _ ` ‘ “) followed by a space
- A sequence that consists only of punctuation marks or special characters is not counted as a word
- HTML tags are counted as separate words in most file formats, except the following: HTML, HAML, MD, XML, IDML, XLIFF, DOCX, DITA
- URLs (e.g., https://crowdin.com) and email addresses (e.g., support@crowdin.com) are each counted as 1 word
- In Chinese, Japanese, and other character-based languages, each character is counted as 1 word. For example, “ライフ・イン・トウキョウ。” is counted as 10 words (the two middle dots and the full stop are punctuation and are not counted)
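
The principles above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not Crowdin's implementation: the function name `count_words_basic` and the code point ranges are made up for illustration, and HTML tags, URLs, and emails get no special handling (a URL or email without spaces is already a single token here).

```python
import unicodedata

# Assumption: only Hiragana, Katakana and the main Han block are treated as
# "one character = one word". The real list of scripts is not given in the text.
CJK_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
)


def is_cjk_letter(ch: str) -> bool:
    """True for a letter inside one of the assumed CJK ranges."""
    cp = ord(ch)
    in_range = any(lo <= cp <= hi for lo, hi in CJK_RANGES)
    # Requiring a letter category skips punctuation such as the Katakana middle dot.
    return in_range and unicodedata.category(ch).startswith("L")


def count_words_basic(text: str) -> int:
    """Count words using whitespace tokens plus per-character CJK counting."""
    total = 0
    for token in text.split():
        cjk_letters = sum(1 for ch in token if is_cjk_letter(ch))
        if cjk_letters:
            # Simplification: a token with CJK letters counts only those letters.
            total += cjk_letters
        elif any(ch.isalnum() for ch in token):
            # At least one letter or digit makes the token a word.
            total += 1
        # Tokens made only of punctuation or symbols add nothing.
    return total
```

For instance, `count_words_basic("hello ? world")` gives 2, because the lone `?` contains no letter or digit, and the Japanese example above gives 10.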
Other examples of how words are counted:
| String | Words |
|---|---|
| Number is -123.45 | 3 |
| <a href="{0}" target="_parent">here</a> | 1 / 7 (if a non-HTML file format is used) |
| {0} – {1} at {2} | 4 |
| two-in-one | 1 |
| 2-in-one | 1 |
| two-in-1 | 1 |
| %file_type% | 1 |
| hello?world | 1 |
| hello ? world | 2 |
| ☂ ☃ ☀⚤ | 0 |
| © %company% | 1 |
| 01/01/1980 | 3 |
| Monday, August 8, 2011 | 4 |
| https://ka-graphie.example.com/6d8b.png | 1 |
| Let’s look | 2 |
| Let's look | 3 (another type of apostrophe is used) |
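
Several rows in the table (the date, the straight apostrophe, the HTML tag in a non-HTML format) only add up if some characters split a word the way a space does. The sketch below encodes one guess at that behavior. The separator set, the regular expressions, and the names `count_words_table` and `html_format` are all assumptions inferred from the table rows, not documented Crowdin behavior, and CJK text is left out to keep the example short.

```python
import re

# Assumption: besides whitespace, these characters separate words.
# The set is inferred from the example table only: slash, straight apostrophe,
# straight and curly double quotes, angle brackets, equals sign.
EXTRA_SEPARATORS = "/'\"\u201c\u201d<>="

# Rough patterns for URLs and emails, which count as one word each.
URL_OR_EMAIL = re.compile(r"https?://\S+|[^\s@]+@[^\s@]+\.[^\s@]+")
HTML_TAG = re.compile(r"<[^>]+>")


def count_words_table(text: str, html_format: bool = False) -> int:
    """Count words; html_format=True mimics formats where tags are not counted."""
    if html_format:
        # In HTML-like formats the tags are markup, so drop them first.
        text = HTML_TAG.sub(" ", text)
    # Pull out URLs and emails before the slash and dots can split them.
    protected = URL_OR_EMAIL.findall(text)
    text = URL_OR_EMAIL.sub(" ", text)
    # Replace every assumed separator with a space, then split on whitespace.
    text = text.translate({ord(c): " " for c in EXTRA_SEPARATORS})
    words = [t for t in text.split() if any(ch.isalnum() for ch in t)]
    return len(protected) + len(words)
```

With these assumptions, `count_words_table("01/01/1980")` returns 3 and `count_words_table("Let's look")` returns 3, while the same phrase written with the typographic apostrophe (U+2019) returns 2, since that character is not in the separator set.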