I am currently studying some Windows Registry accesses from Excel & Co. A formula in an Excel sheet helps me to split these entries into its components; meaning the path elements and the value. The location where a file is stored can be addressed in a similar manner like a registry key. So, I had the idea to write this article and use my formula for splitting a file location in its path and file name.

The storage location of a file usually starts with a drive letter specification followed by a series of folder names separated by separator and ends with the filename. For a Windows operating system, the separator is a backslash.

Depending on where the file is located, the path is sometimes shorter, sometimes longer. To retrieve the file name with an Excel formula, we have to know the position of the last separator in the path. Then, we will be able to use the Excel function “Right()” and extract the file name.

The number of occurrences of a string in another string can be determined relatively easy in Excel. We have just to replace the searched string by an empty string and then calculate the length of the resulting string. If we subtract this value from the length of the original string and divide this result by the length of the searched string, we will exactly get the number of occurrences. If the length of the searched string is one character, we can use:

D7 = LEN($D$3)-LEN(SUBSTITUTE($D$3;$D$4;""))

The separator is stored in cell D4 and the location of the file in cell D3; in the figure above “C:\Users\Mourad\Documents\Sample.xls”.

The formula returns “4”, this corresponds exactly the count of backslashes in the text. The Excel function “Substitute()” allows to replace a given text by another and also allows in its last argument to specify which occurrence should be replaced. So, the formula should look like:

D9 = SUBSTITUTE("C:\Users\Mourad\Documents\Sample.xls"; "\"; "$$$"; 4)

If we replace the static values by the cell references, we obtain:

D9 = SUBSTITUTE($D$3;$D$4;$D$5;$D$7)

As you can see in the screenshot above, cell D9 now contains the value “C:\Users\Mourad\Documents$$$Sample.xls”. Of course, we can use any other text instead of “$$$”; the only condition is that the chosen text is not already present in the path and file name.

Now, it is very easy to get the path and the file name. The function “Find()” returns the position of a searched text in another text. We can use this position as arguments for the “Left()” and “Right()” functions:

D11 = LEFT($D$9;FIND($D$5;$D$9)-1)

D13 = RIGHT($D$9;LEN($D$9)-LEN($D$11)-LEN($D$5))

The first variant for getting the filename uses the result of the path extraction, if we would like to retrieve the file name independently from the path, we can use:

D13 = RIGHT($D$9;LEN($D$9)-FIND($D$5;$D$9)-LEN($D$5)+1)

#### Creating a formula without the use of helping cells

If a formula is needed, which directly extracts the path and file name and does not make use of the help cells, the first thing we can do, is to substitute the references to the help cells by the formulas. The formula for retrieving the path will then be:

D18 = LEFT(SUBSTITUTE($D$3;$D$4;$D$5;LEN($D$3) - LEN(SUBSTITUTE($D$3;$D$4;"")));FIND($D$5; SUBSTITUTE($D$3;$D$4;$D$5; LEN($D$3)-LEN(SUBSTITUTE($D$3;$D$4;""))))-1)

The first argument of the function “Left()” contains the formula for substituting the “\” with “$$$”. However, as the last backslash is searched, it does not matter if we use the substituted file location or the file location itself. So we can replace the formula in the first argument of “Left()” by the file location:

D19 = LEFT($D$3;FIND($D$5; SUBSTITUTE($D$3;$D$4;$D$5;LEN($D$3) - LEN(SUBSTITUTE($D$3;$D$4;""))))-1)

We can build the formula for extracting the file name in the same way like mentioned above. The result is, in a first step, a complex formula:

D21 = RIGHT(SUBSTITUTE($D$3;$D$4;$D$5;LEN($D$3) - LEN(SUBSTITUTE($D$3;$D$4;""))); LEN(SUBSTITUTE($D$3;$D$4;$D$5; LEN($D$3)- LEN(SUBSTITUTE($D$3;$D$4;""))))-FIND($D$5; SUBSTITUTE($D$3;$D$4;$D$5; LEN($D$3)- LEN(SUBSTITUTE($D$3;$D$4;""))))-LEN($D$5)+1)

This formula can also be optimized. Again, we can directly use the file location instead of the file location with the replaced text. This leeds to:

D22 = RIGHT($D$3;LEN($D$3)-FIND($D$5; SUBSTITUTE($D$3;$D$4;$D$5;LEN($D$3)- LEN(SUBSTITUTE($D$3;$D$4;""))))-LEN($D$5)+1)

WHowever, we have not to subtract the length of the replacement text, as we are now using the original file location. The final formula is then:

D22 = RIGHT($D$3;LEN($D$3)-FIND($D$5; SUBSTITUTE($D$3;$D$4;$D$5;LEN($D$3)- LEN(SUBSTITUTE($D$3;$D$4;"")))))

At this point, thanks to Andreas, who provided some tips for me. And it is certainly possible to extend the formulas; for example by checking, if the function “Find()” results in an error or if the replacement text is already contained in the file location.

#### Using VBA to retrieve the path and filename

The path and the file name can also be retrieved by using the following two VBA functions:

Update on 09/21/2011: I had overseen that the new layout of the blog had an impact on some tags in this post. Therefore the formulas were only dispayed in one long line without any automatic breaks. I changed this and manually added some breaks.

This article has also been published on the German Version of the Excel Ticker blog.

Hi Susan…

Thanks, however, I am not sure, if I had completely understood the request. And I am also afraid, that some text missing at the end of your comment; may be it was cut by the blog software.

I can offer you to e-mail me at support(at)maninweb(dot)de and I will look, what I can do for you. Is it perhaps possible to attach a sample file, so that I can better retrace the steps you have done?

Best Regards 🙂