General Question

ETpro's avatar

How can I recover large text strings Excel changed into # symbols?

Asked by ETpro (34605points) August 29th, 2012

I need to build a CSV file that includes a column of lists of page URLs with space delimiters. The strings average about 70 characters, and in some cases there are as many as 1,000 of them, so a total of something on the order of 71,000 characters including the delimiter space characters.

When there are just a few URLs in a cell, Excel (Office 2007 on Win 7 Enterprise Ed.) handles it just fine. I can see the first URL in the string, and all the rest are there and readable. But some of the rows need to have several hundred or even more than 1,000 URL strings. Excel wouldn’t even allow me to paste these long strings in till I formatted all the cells in the column to Text. It then allowed pasting in the data, but it displayed 255 # characters in a string like #########... Saving the spreadsheet as a CSV, I found that Excel wasn’t kidding. In place of my data, it was putting out 255 character strings of continuous # symbols. It took ages to gather and paste that URL information into the spreadsheet, and the only way to get it again if it is unrecoverable in Excel is to repeat all the searches I did to get it originally.

I do have the spreadsheet with all the apparently corrupted data saved as an XLSX file. Is there any way to recover the corrupted data in it, or must I start over again. If the answer is to start over again, what do I do to avoid the same data corruption on the next try?

Observing members: 0 Composing members: 0

14 Answers

6rant6's avatar

I think I’d use a simple text program like Notepad to open the files and see if the URLs are stored in the file. If not, you are well and truly screwed would be my guess.

tom_g's avatar

Probably one of those “you should have” comments, but…

If I ever work in local files (Word, Excel, text, etc), I use Dropbox, so I can revert back to an earlier version. It can be used as simple source control.

Tropical_Willie's avatar

Can you convert the cells of the Work Sheet to HYPERLINK ? ?

Nullo's avatar

Isn’t that what Excel does when the string is too long for the box?

CWOTUS's avatar

Widen the column in the spreadsheet.

downtide's avatar

If you can’t widen the column, change the column properties to wrap text to fit. However you will then get very deep rows… it will have to be one or the other.

phaedryx's avatar

Could you create the file with a text editor? Sounds like you’re hitting a character limit in Excel.

It sounds like a CSV file isn’t the right format for your data. Is there a reason you need CSV files specifically?

phaedryx's avatar

Just googled, you can’t have more than 32,767 characters in a cell. Perhaps 1,000 URLs could hit that limit?

ETpro's avatar

@6rant6 Already been there with textpad. I was, indeed, “well and truly screwed”.

@tom_g I’m adopting a multiple backup name strategy. I name each copy to indicate its stage in the process. I have terabytes of headroom in local storage, so no need to use Dropbox, but it’s the same idea.

@Tropical_Willie Not a bad idea, but in this case, no. I intended to upload the resulting spreadsheet as a CSV file to populate a database. The final solution, however, turned out to be rather close to that.

@Nullo Now I know that it is.

@CWOTUS That doesn’t help.

@downtide As with @CWOTUS’ suggestion, been there, done that, got the tee shirt. On it is a one word in big letters, FAIL! :-(

@phaedryx Yes, one several strings were in excess of 32,767 character. One was more than 2 times that. The odd thing is that only a few were over the limit, but pasting them in caused cells that were well within the limit to convert to 255 # symbols as well. It degraded data that was within the size limit. So if Excel doesn’t want to accept a paste, be careful about coaxing it.

I needed to use Excel because I had a 3 supplier databases of product IDs with 30 columns of Yes/No parameters to sort on. I needed to sort each spreadsheet 30 times, once by each column + ID to keep the results in alpha/numeric order, then copy the IDs that had a yes for a given parameter.

I pasted them into an Excel spreadsheet. But I could have built a CSV in a text editor. What I ended up doing was pasting the IDs directly into the CMS system’s “Contents” variable for each category page in question. Fortunately, the CMS system involved accepts either a space delimited list of IDs or a carriage return delimited list, so pasting directly from a bunch of rows in the ID column worked just fine. No need to save a CSV and do a database upload.

CWOTUS's avatar

If the file handles small URLs okay, then use TinyURL or an equivalent to condense each one.

6rant6's avatar

That 32,767 limit may count URL encoded characters as 4.

ETpro's avatar

@CWOTUS I wish I could, but these are actually product IDs which end up forming the HTML name of the page, so becoming part of a URL. For SEO purposes, the store owner came up with keyword rich IDs, like “Ungimmicked-Fine-Tip-Sharpie-Black-box-of-12-by-Murphys-Magic-Supplies-Trick” which will become www.domain.com/Ungimmicked-Fine-Tip-Sharpie-Black-box-of-12-by-Murphys-Magic-Supplies-Trick.html. To work when uploaded that text string must be preserved exactly.

No matter. Even if it’s 1 for 1, I am still more than double the limit on some fields.

rodlee's avatar

Strings in VBA are by their nature resizable. All of VBA’s string functions are used to set or retrieve portions of strings of variable length. You may find, however, that you need strings of a specific length, such as when you are writing text-based files that will be parsed by another application, or simply for alignment in list boxes.

ETpro's avatar

Thanks, but I attacked it at the data level. Some of the product descriptions were truly comical combos of HTML markup and local styles. The selection of colors, incredibly huge fonts, and such was hilarious. Apparently the HTML author that did it operated under the theory that if they don’t believe you and do what you want them to do immediately, you just aren’t bellowing loudly enough. I pulled a CSV and without even trying to open it in Excel, I used the industrial strength text editor, TextPad to run a Perl Reg-Ex search that matches all HTML tags and replaces them with ””. Many problems solved in one search and replace.

Answer this question

Login

or

Join

to answer.

This question is in the General Section. Responses must be helpful and on-topic.

Your answer will be saved while you login or join.

Have a question? Ask Fluther!

What do you know more about?
or
Knowledge Networking @ Fluther