When TRIM and CLEAN do not work…
Is it something bothering you?
To fix this problem, we need to understand where is the “leading space” coming come. Before we jump to that, I want to show you two Excel functions first:
CHAR and CODE functions
CHAR returns the character specified by a number. Use CHAR to translate code page numbers you might get from files on other types of computers into characters. (Source: Office Support )
CODE returns a numeric code for the first character in a text string. The returned code corresponds to the character set used by your computer. (Source: Office Support)
|Operating environment||Character set|
|Macintosh||Macintosh character set|
Hmmmm…… Let’s put it this way, computer reads data very differently. When we read some commonly used characters, i.e. a-Z, 0-9, !@$%^, computer may read them as code. For instance, when we read the character A, computer may interprets it as code 65; B as code 66, etc. As a computer user, it doesn’t really matter in most of the time. However, when we are dealing with data including non-printing characters (some characters invisible to us; but mean something to computer), it may create lots of “unknown” puzzle.
By using the CHAR and CODE functions, we can see code of a character, and vice versa.
Try to generate a numeric list from 1-255 and use the CHAR function to see the character it stands for.
Take a look at the following screenshot, which I have filtered “invisible” contents. To us, all non-printing characters and/or space look indifferent, but computer treats them differently.
Having said that they look all indifferent, we see a subtle difference when we concatenate the content from column B with a text string. Some “invisible” characters indeed take up a “space” or two…
And these are where the leading spaces could be coming from.
TRIM and CLEAN come to recuse…
(in most of the cases)
TRIM removes all spaces from text except for single spaces between words. Use TRIM on text that you have received from another application that may have irregular spacing.
Important: The TRIM function was designed to trim the 7-bit ASCII space character (value 32) from text. In the Unicode character set, there is an additional space character called the nonbreaking space character that has a decimal value of 160. This character is commonly used in Web pages as the HTML entity, . By itself, the TRIM function does not remove this nonbreaking space character.
(Source: Office Support)
oooops….Trim the space character (32) only?
How about CLEAN?
CLEAN removes all nonprintable characters from text. Use CLEAN on text imported from other applications that contains characters that may not print with your operating system. For example, you can use CLEAN to remove some low-level computer code that is frequently at the beginning and end of data files and cannot be printed.
Important: The CLEAN function was designed to remove the first 32 nonprinting characters in the 7-bit ASCII code (values 0 through 31) from text. In the Unicode character set, there are additional nonprinting characters (values 127, 129, 141, 143, 144, and 157). By itself, the CLEAN function does not remove these additional nonprinting characters. For an example of how to remove these additional nonprinting characters from text, see Remove spaces and nonprinting characters from text.
(Source: Office Support)
Oops… none of them deals with Character 160.
This is exactly why the TRIM and CLEAN do not work…
So, let’s click the “Remove spaces and nonprinting characters from text” from the office support page, you would expect to see the answer to fix the problem (this was what I expected) but sorry…… hints only.
At the end of the section <Removing spaces and nonprinting characters from text>, you’ll see:
To remove these unwanted characters, you can use a combination of the TRIM, CLEAN, and SUBSTITUTE functions.
You will need to do a little more research and learn the SUBSTITUTE function in order to remove the leading space caused by character (value 160); and it requires you to know nested function.
So I am going to give you the required formula here:
‘where D2 holds the texts containing leading space
Tip: If you are sure that all leading spaces are CHAR(160), then you can shorten the formula to:
'It means substitute all spaces caused by CHAR(160) in D2, with an empty string
As simple as this. Hope you like it. 🙂
To learn more about SUBSTITUTE, you may read my post here.
In rare situations, the non-printing characters could fall anywhere within a text string. Then you may see the code of each character in the text string by using the following formula:
=CODE(MID($A$1,COLUMN(A1),1)) 'copy to the right until you see #VALUE!
where A1 holds the text string you want to examine
You may download a Sample File – Invisible Characters to follow along.