
Color For Unicode Emoji

Answer: Yes, you can color them!

    div {
      color: transparent;
      text-shadow: 0 0 0 red;
    }

    <div></div>

Not every emoji works the same way. Some are old textual symbols that now have an (optional or default) colorful representation; others were introduced explicitly (and only) as emoji. That means some Unicode code points have two possible representations, text and emoji, and authors and users should be able to express their preference for one or the other. This is currently done with the otherwise invisible variation selectors U+FE0E (text, VS-15) and U+FE0F (emoji, VS-16), but higher-level solutions (e.g. for CSS) have been proposed. The text-style emoji are monochromatic and should be displayed in the foreground color, i.e. currentcolor in CSS, just like any other glyph. The Unicode Consortium provides an overview of emoji by style (beta version). You should be able to append &#xFE0E; in HTML to select the textual variant of anything in the columns labeled “De...
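A minimal sketch of the variation-selector mechanism, written in Python only because it is a convenient way to build the strings; the mechanism itself is language-independent, and whether each variant actually renders as requested still depends on the font and platform:

    # U+2764 HEAVY BLACK HEART has both a text and an emoji presentation.
    heart = "\u2764"

    text_style = heart + "\uFE0E"   # VS-15: request monochrome text presentation
    emoji_style = heart + "\uFE0F"  # VS-16: request colorful emoji presentation

    print(text_style)   # monochrome glyph, font permitting
    print(emoji_style)  # color emoji, font permitting

    # The variation selector is a real, otherwise invisible code point:
    print([hex(ord(c)) for c in emoji_style])  # ['0x2764', '0xfe0f']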

C++: Printing ASCII Hearts And Diamonds In A Platform-Independent Way

Answer: If you want a portable way, you should use the Unicode code points (which have defined glyphs associated with them):

    ♠ U+2660 Black Spade Suit
    ♡ U+2661 White Heart Suit
    ♢ U+2662 White Diamond Suit
    ♣ U+2663 Black Club Suit
    ♤ U+2664 White Spade Suit
    ♥ U+2665 Black Heart Suit
    ♦ U+2666 Black Diamond Suit
    ♧ U+2667 White Club Suit

Remember that everything below character 32 in ASCII is a control character. Control characters have a meaning associated with them, and you have no guarantee of getting a glyph or a particular behavior there (even though most control characters do have glyphs, they were never intended to be printable). Still, it's not a safe bet. However, using Unicode needs proper font and encoding support, which may or may not be a problem on UNIX-likes. On Windows, at least some of the above code points map to the ASCII control-character glyphs you're outputting if the console is set to raster fonts (and therefore not supporting Unicode or anything else th...
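The encoding half of that caveat is easy to see from the bytes involved. A quick sketch (in Python for brevity, since the point is language-independent): the terminal ultimately receives a multi-byte UTF-8 sequence, so both the program's output encoding and the console font have to cooperate:

    # U+2665 BLACK HEART SUIT and the bytes a UTF-8 terminal receives.
    heart = "\u2665"
    print(heart)                   # ♥ (font and encoding permitting)
    print(heart.encode("utf-8"))   # b'\xe2\x99\xa5' -- three bytes, not one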

ASCII Vs Unicode + UTF-8

Answer: "In modern times, ASCII is now a subset of UTF-8, not its own scheme. UTF-8 is backwards compatible with ASCII." Yes, except that UTF-8 is an encoding scheme. Other encoding schemes include UTF-16 (with two different byte orders) and UTF-32. (Confusingly, a UTF-16 scheme is called "Unicode" in Microsoft software.)

And, to be exact, the American National Standard that defines ASCII specifies a collection of characters and their coding as 7-bit quantities, without specifying a particular transfer encoding in terms of bytes. In the past, it was used in different ways, e.g. so that five ASCII characters were packed into one 36-bit storage unit, or so that the eighth bit of an 8-bit byte was used for checking purposes (a parity bit) or for transfer control. But nowadays ASCII is used so that one ASCII character is encoded as one 8-bit byte with the most significant bit set to zero. This is the de facto standard encoding scheme and is implied in a large number of specifications, but strictly s...
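A short sketch in Python makes both points concrete: pure-ASCII text yields identical bytes under ASCII and UTF-8 (each byte with its most significant bit zero), while UTF-16 exists in two byte orders:

    text = "ASCII"

    # ASCII is a subset of UTF-8: the encoded bytes are identical...
    assert text.encode("ascii") == text.encode("utf-8") == b"ASCII"

    # ...and every byte has its most significant bit set to zero.
    assert all(byte < 0x80 for byte in text.encode("utf-8"))

    # UTF-16 comes in two byte orders; the plain "utf-16" codec also
    # prepends a byte order mark (BOM) so decoders can tell which one.
    print("A".encode("utf-16-le"))  # b'A\x00'
    print("A".encode("utf-16-be"))  # b'\x00A'
    print("A".encode("utf-16"))     # BOM + native order, e.g. b'\xff\xfeA\x00'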

Byte String Vs. Unicode String In Python

Answer: No, Python does not use its own encoding. It will use any encoding that it has access to and that you specify. A character in a str represents one Unicode character. However, to represent more than 256 characters, individual Unicode encodings use more than one byte per character. bytes objects give you access to the underlying bytes. str objects have the encode method, which takes a string naming an encoding and returns the bytes object that represents the string in that encoding. bytes objects have the decode method, which takes a string naming an encoding and returns the str that results from interpreting the bytes as a string encoded in the given encoding. Here's an example:

    >>> a = "αά".encode('utf-8')
    >>> a
    b'\xce\xb1\xce\xac'
    >>> a.decode('utf-8')
    'αά'

We can see that UTF-8 is using four bytes, \xce, \xb1, \xce, and \xac, to repr...
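As a follow-up sketch (not part of the original answer), decoding those same bytes with the wrong codec shows why the encoding must be specified: the bytes alone don't record how they were produced:

    data = "αά".encode("utf-8")     # b'\xce\xb1\xce\xac'

    print(data.decode("utf-8"))     # 'αά'   -- correct round trip
    print(data.decode("latin-1"))   # 'Î±Î¬' -- same bytes, wrong codec: mojibake

    try:
        data.decode("ascii")        # bytes >= 0x80 are not valid ASCII
    except UnicodeDecodeError as exc:
        print(exc)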