
Technifor T08 User Manual

118 pages
M_T08_EN_B
Introduction
UTF-8 encoding
UTF-8 was designed to be compatible with software originally intended to process one-byte
characters. Each Unicode character is encoded as a sequence of 1 to 4 bytes.
UTF-8 is standardised in RFC 3629 (UTF-8, a transformation format of ISO 10646). The encoding is also defined
in Technical Report 17 of the Unicode standard. It is part of the standard, in chapter 3 "Conformance", and
is approved by the International Organization for Standardization (ISO), the Internet Engineering Task Force
(IETF) and most national standardisation organisations.
Encoding
Characters numbered from 0 to 127 are encoded on 1 byte whose most significant bit is always 0.
Characters with a number greater than 127 are encoded over several bytes. In this case, the most
significant bits of the first byte form a run of 1 bits as long as the number of bytes used to encode the
character, followed by a 0. Each of the following bytes has 10 as its two most significant bits.
Definition of the number of bytes used
UTF-8 binary representation            Meaning
0xxxxxxx                               1 byte coding 1 to 7 bits (from 0 to 127)
110xxxxx 10xxxxxx                      2 bytes coding 8 to 11 bits (from 128 to 2047)
1110xxxx 10xxxxxx 10xxxxxx             3 bytes coding 12 to 16 bits (from 2048 to 65535)
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx    4 bytes coding 17 to 21 bits (from 65536 to 2097151)
The same scheme could be extended up to 6 bytes, but UTF-8 sets the limit to 4. The scheme would also allow a
character to be coded with more bytes than needed, but UTF-8 forbids such overlong encodings.
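The bit layout described above can be sketched in Python as follows. This is only an illustration, not code from the T08 program: the function name `utf8_encode` is hypothetical, and the upper bound 0x10FFFF and the rejection of the surrogate range 0xD800 to 0xDFFF are taken from RFC 3629 rather than from this manual. Because the shortest possible form is always chosen, no overlong encoding is produced.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one character number as UTF-8 (illustrative sketch)."""
    # Assumption from RFC 3629: valid numbers are 0..0x10FFFF, surrogates excluded.
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("character number out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values cannot be encoded in UTF-8")

    if code_point <= 0x7F:
        # 0xxxxxxx: 1 byte, up to 7 bits
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx: 2 bytes, up to 11 bits
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b00111111),
        ])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx: 3 bytes, up to 16 bits
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b00111111),
            0b10000000 | (code_point & 0b00111111),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 4 bytes, up to 21 bits
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b00111111),
        0b10000000 | ((code_point >> 6) & 0b00111111),
        0b10000000 | (code_point & 0b00111111),
    ])
```

For everyday use, Python's built-in `chr(n).encode("utf-8")` gives the same bytes. The manual version is shown only to make the bit layout visible.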
Note: a 4-byte UTF-8 representation corresponds to a character code greater than 65535,
which must not be used with the T08 program.
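One way to check a text against this restriction before sending it to the controller is sketched below in Python. The names `T08_MAX_CHARACTER_NUMBER` and `characters_above_limit` are hypothetical and are not part of any Technifor software. The only fact taken from the manual is the limit of 65535.

```python
# Limit stated in the note above; the constant name is an assumption.
T08_MAX_CHARACTER_NUMBER = 65535


def characters_above_limit(text: str) -> list:
    """Return the characters whose number exceeds 65535 (4-byte UTF-8 forms)."""
    return [ch for ch in text if ord(ch) > T08_MAX_CHARACTER_NUMBER]


# A, e-acute and the euro sign are all below the limit.
print(characters_above_limit("A\u00e9\u20ac"))
# U+1F600 needs 4 bytes in UTF-8 and would be reported.
print(len(characters_above_limit("A\U0001F600")))
```

An empty result means every character in the text fits in at most 3 UTF-8 bytes.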
Example
Example of the UTF-8 encoding
Character    Character number    UTF-8 binary encoding
A            65                  01000001
é            233                 11000011 10101001
€            8364                11100010 10000010 10101100
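The rows of this example can be reproduced with Python's built-in UTF-8 codec. The helper name `utf8_bits` is hypothetical. The character numbers 65, 233 and 8364 (the euro sign) are the ones from the example.

```python
def utf8_bits(character: str) -> str:
    """Return the UTF-8 encoding of one character as space-separated bytes in binary."""
    return " ".join(format(byte, "08b") for byte in character.encode("utf-8"))


for number in (65, 233, 8364):
    print(number, utf8_bits(chr(number)))
# 65 01000001
# 233 11000011 10101001
# 8364 11100010 10000010 10101100
```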
In any UTF-8 character string, any byte whose most significant bit is 0 encodes a single US-ASCII character.
Characters whose codes lie between 0 and 127 are therefore represented exactly as in ASCII
(unaccented upper-case and lower-case letters, digits and some common symbols).
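As a small illustration of this property, the role of any byte in a UTF-8 string can be read from its most significant bits alone. The function name `byte_role` and the labels it returns are hypothetical. The sketch does not validate the stream, so byte values that never occur in valid UTF-8 (for example 0xFF) are still reported as "lead".

```python
def byte_role(byte: int) -> str:
    """Classify one byte of a UTF-8 string by its most significant bits."""
    if (byte & 0b10000000) == 0:
        return "ascii"          # 0xxxxxxx: a complete US-ASCII character
    if (byte & 0b11000000) == 0b10000000:
        return "continuation"   # 10xxxxxx: second or later byte of a character
    return "lead"               # 11xxxxxx: first byte of a multi-byte character


# "A" followed by e-acute (character number 233), encoded as UTF-8.
data = "A\u00e9".encode("utf-8")
print([byte_role(b) for b in data])
# ['ascii', 'lead', 'continuation']
```

Because an ASCII byte can never appear inside a multi-byte sequence, software that only looks for ASCII characters (for example a separator such as a comma) can scan UTF-8 data byte by byte without decoding it.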
