Work Around and description of Excel Empty Value problem
It is common to use Microsoft Excel to save tables as tab delimited text files. Sometimes tables will have rows that END with cells with empty values In some cases like this, Excel creates a file that does not include the delimiters for these cells. The first 16 rows may include the delimiters because the header row does not contain missing values. But after 16 rows it was observed that the new line starts with out delineating where the cells with empty values are.
from web search:
Why do I get the error "wrong number of values" when I use insheet to read data from Excel?
Title How Excel writes empty cells
Author Paul Lin, StataCorp
Date April 1997
This problem has to do with how Excel writes empty cells into files. Pretend that a row of your spreadsheet reads
(1) (2) (3) (4) (5) (6) (7) (8) (9)
---------------------------------------------------
| | | | | | | | | |
| 136 | 9.8 | | 64 | 108 | | | 7.8 | |
| | | | | | | | | |
---------------------------------------------------
Excel would write this in a file as
136,9.8,,64,108,,,7.8,
and Stata can read this without difficulty. Note the comma at the end of the line. It is because of that comma that Stata knows there is a ninth value; it is merely that the ninth value is missing in this observation. Stata understands this.
Sometimes, however, Excel will write these same data as
136,9.8,,64,108,,,7.8
without the trailing comma. Thus, it appears to insheet that the line contains eight rather than nine values, and so insheet complains.
The conditions under which Excel does this are complicated. Basically, Excel reads a chunk of the spreadsheet and then looks at the chunk. If the last column in the chunk contains empty cells, then it omits that column! So, in the resulting file, there may be some records showing nine columns, others showing eight, and even others showing seven or fewer! Each will occur in a group (that is determined by Excel's buffer size).
If you run into this difficulty, here is what you do:
1. To your spreadsheet, add another column. Fill the columns with 1s.
2. Save the spreadsheet.
3. Read the spreadsheet into Stata using insheet.
4. Drop the last variable.
Adding a column of 1s will avoid the problem of empty cells in the last column, so Excel will write the correct number of separators on each line.
http://www.stata.com/support/faqs/data/1excelcol.html
The above work around of adding a column at the end of the table with values sounds like it could work. I like this one better. I used OpenOffice.org Calc to convert an Excel file into a tab deliniated file (save as Text CVS and then use Edit filter settings). Doing a "save as" with Calc creates a file that includes all of the delimiters even with cells with empty values at the end of rows. A comparison can be made using a binary file editor (bvi) to look at the files. Opening them in Microsoft Word and doing a "show/hide" shows the delimiters and new line feed too.
Morpho was unable to delete some of the columns, near the end, with the Excel problem file. I replaced the text file that Excel had created with the one that Calc had created. Morpho was able to delete columns. Before this I could not delete columns. The response time working within the table view in Morpho was better too.