Combine Two or More Spreadsheets Together with Lookup Functions: Example 1*
Datasets come in a variety of formats – from MS Access, from SPSS, or from CSV files. The first step in the data analysis process is to import, export, and merge multiple datasets together into a master dataset that you can use for further analyses.
Lookup functions are for merging and combining datasets, like when you’ve got program participants’ demographic data in one place and their scores or outcomes in another place. I can’t tell you how many times I’ve watched people waste hours of their time by trying to merge these spreadsheets manually (e.g., by copying and pasting back and forth). You’ve got better things to do with your time, don’t you? Lookup functions to the rescue!
There are two primary versions of lookup functions: vertical lookup and horizontal lookup. Most of the time, you’ll be using a vertical lookup. Use a vertical lookup or vlookup when unique ID numbers are listed vertically down the first column of your spreadsheet—when each entry in your spreadsheet is located in its own row. Use a horizontal lookup or hlookup when unique ID numbers are listed horizontally across the first row of your spreadsheet—when each entry in your spreadsheet is located in its own column.
A disclaimer: vlookup and hlookup have learning curves. I first learned this function from a teammate back in 2008. Bless her heart, she explained and re-explained the function to me a million times. I practiced on my own, and finally–probably a full month later–I got it. So don’t worry if this function seems tricky at first. Eventually everything will click in your brain, and you’ll be on your way to saving hours and hours of your time.
The function contains four pieces of information. It looks like this: =vlookup(lookup_value,table_array,col_index_num,range_lookup).
And here’s what each piece of the function really means:
- lookup_value: The cell that contains the person’s ID number. The ID numbers are the link or key that connect all the spreadsheets together. The ID numbers must be located in the first column of each table – in the first column of your new combined dataset and in the first column of every single table from which you’re pulling data.
- table_array: This is the table from some other spreadsheet or other file from which you’re pulling data.
- col_index_num: This is the column in that table where you’re pulling the data from. Just type in the number of the column. For example, if you want to pull in data from Column C of another table, you’d type “3.”
- range_lookup: Always type “false” and you’ll be in good shape.