Fixing a cell reference when everything else moves

If you write a formula in Excel or Google Docs that includes a cell reference, then that reference will change when you copy the formula across a row or down a column. 

For example: if you type this =A2 in cell B2, then copy it down to B3 and B4, it will become =A3 and =A4 in each of those.

If you copy if across to C2 and D2, the formula will become = B2 and =C2 in those.

You can stop this from happening by using the dollar sign before each part of the cell reference as follows:

=$A$2

Now when you copy the formula it stays the same.

If you only want one part of the formula to change, you can use just one dollar sign like so:

=$A2

(which fixes the column - A - but allows the row - 2 - to change) or

=A$2

(which fixes the row - 2 - but allows the column - A - to change).

When the spreadsheet changes: =INDIRECT

If the spreadsheet itself is changing - inserted or deleted rows, for example - then the dollar sign won't help: $A$2 will still change when a new row or column is added. But there's a function that allows you to do much the same thing: stop cell references getting messed up. With this, instead of typing =A2 or =$A$2 you type the following:

=INDIRECT("A2")

The part in parentheses is a string, so you can compile that any way you want, e.g. 

=INDIRECT("A"&B4)

...if B4 contains the number 2, then this will look at the cell reference "A2", if B4 contains the number 7 then it will look at "A7", and so on.

Finding data based on its proximity to other data: OFFSET

Nick Cahm brought my attention to the =OFFSET function which allows you to fetch data based on a cell reference.

A formula using this looks like this:

=OFFSET(A1, 2, 3, 1, 1)

Where the first value (A1) is your starting point, and the other values tell you where to go from there, like so:

=OFFSET(start at A1, go 2 rows down, then 3 cells across, and grab an area 1 cell high, and 1 cell wide)

Or, put another way:

=OFFSET(range,rows, columns, height, width)

I imagine the formula would be most useful where the first value - range - is determined elsewhere. In other words, you might put a nested formula in there to generate your starting point.

In practical terms, it could be used to find, for example, the next worst (or best) score based on starting from one name (where the results are sorted on that basis), or the 'nearest neighbour' to any particular object.

Other thoughts on potential uses welcome!

Checking if data matches: =EXACT

This one's pretty simple:

=EXACT(A2,B2)

The exact function compares two values and returns TRUE if they match, and FALSE if they don't.

Imagine, for example, you have two datasets and want to check if there are any names used that are not precisely the same (which would cause you problems when trying to match them up).

Sort them both alphabetically so that there's one of each (pivot tables may be a good way to do this), and use the =EXACT function to compare the two lists. You'll still have to manually check each 'FALSE', however, as there may be another match elsewhere in the data if the lists don't contain the same items.

Using COUNTA with CONCATENATE to avoid multiple commas

Aendrew Rininsland started writing this formula to avoid the common problem of repeated commas when you're concatenating an addresses and empty cells leave you with something like this: 

"The House, Bolton Street, , Lancs, , "

His first attempt involved using the IF formula to only concatenate a cell when it contained data. But Excel didn't treat IF in the same way as most programming languages. 

So we tried using COUNTA instead (which returns 1 if a cell contains something, and 0 if it contains nothing), and it worked. Here's the formula:

=concatenate(f2,if(counta(g2),concatenate(", ",g2),""),if(counta(h2),concatenate(", ",H2),""))

And broken down:

=concatenate(f2,

Start with the contents of F2

if(counta(g2)

If G2 contains anything

,concatenate(", ",g2)

Then concatenate, inserting a comma, followed by the contents of G2

,"")

But if not, then insert nothing ("").

,if(counta(h2)

If H2 contains anything, again concatenate, inserting a comma, followed by the contents of H2

,concatenate(", ",H2)

But if not, then insert nothing ("").

,""))

Converting currency on today's rates: =GoogleFinance

One of the strengths of Google Docs is its ability to connect with live information online, and the integration of formulae which play to this strength.

A good example is =GoogleFinance, which can fetch all sorts of information. The most obvious application is currency rates: here's how you can write a formula that automatically converts an amount of money into your own currency based on the latest - or a historical - rate.

The formula to fetch the rate goes as follows:

=GoogleFinance("CURRENCY:AUDGBP")

Some things to note: 
  • Make sure you include quotation marks inside the parentheses.
  • Change the currency codes to the currencies you want to convert from (first) and to (second).
In this example the currencies are Australian Dollars (AUD) and British Pound Sterling (GBP). A full list of codes is available here, as well as on the spreadsheet linked below.

If you wanted to find the rate for converting from pounds to euros, for example, you'd change the formula as follows:

=GoogleFinance("CURRENCY:GBPEUR")

To apply this rate to your data, you can simply adapt it as follows:

=A2*GoogleFinance("CURRENCY:AUDGBP")

(Where A2 contains the amount in Australian dollars)

There's more discussion on this technique on this forum, as currency rates have different aspects, such as bid prices and asking prices. Ahab notes that the formula supports "at least the following 2nd parameter values: "bid" (default wehn omitted), "ask", and "average"."

Using filters to find similar companies in a dataset

Here's an example of how filters can be useful in narrowing down a dataset to relevant companies - as well as avoiding potential mistakes in using =COUNTIF or =SUMIF and counting the same companies twice:

A quick glance at Birmingham Council's expenditure data for November 2011 (CSV) shows a number of payments to taxi companies. But their names vary:
  • Some have 'taxi' in their company name
  • Some have 'cars'
  • And others have 'private hire'
To filter our list we do the following:
  1. Make a copy of the expenditure spreadsheet
  2. Type our filtering criteria in column G as follows (the asterisks are wildcards, meaning that any entries containing that word or phrase will be caught. Without wildcards, it would only look for entries with 'taxi' alone, for example, and not 'Bob's taxis':
    1. G1: Vendor Name
    2. G2: *taxi*
    3. G3: *cars*
    4. G4 *private hire*
  3. Create a new worksheet
  4. While in that empty worksheet, select Data > Filter > Advanced filter...
  5. On the window that appears:
    1. select 'copy to another location'
    2. Click in the 'List range' box, go to the sheet containing the data you want to filter, and select the range of cells you want to filter (or press CTRL+A to auto-select them). This should then read '847527Payments_over_£500_Novemb'!$A:$E'847527Payments_over_£500_Novemb'!$A$1:$E$16423
    3. In 'Criteria range' select the cells you created in G1:G4 on the same sheet - it should then read '847527Payments_over_£500_Novemb'!$G$1:$G$4
    4. In 'Copy to:' select the first empty cell in your empty sheet, i.e.: Sheet1!$A$1
  6. Click OK and the filtered data should appear. You can now perform calculations, pivot tables etc. on this - and check if any irrelevant data has been captured along the way, such as 'Mr Carson' (which would be grabbed under the *cars* criteria). You might also spot other companies you've missed that don't have taxi, cars or private hire in their title, and add to and re-run the filter accordingly.

You can see the finished result here.

Calculating totals for a single company: =SUMIFS

Here's an Excel formula that's an absolute godsend if you want to know how much money was given to a particular company but don't want to use pivot tables.

The SUMIFS formula will add up (SUM) all values that meet a series of conditions (IFs)

Here's an example:

=SUMIFS(D:D,B:B,"BIRMINGHAM LEP COMPANY LTD")

This is adding up all values in column D (Invoice amount) where the value in column B is "BIRMINGHAM LEP COMPANY LTD" (The data I used for this is Birmingham Council expenditure for November 2011)

The formula breaks down as follows:

=SUMIFS(

D:D,

The first thing it needs is the range containing values to be added up (i.e. SUMmed) - in this case, column D.

B:B,

Now we tell it what cells contain the values to be tested (IFs) - in this case, column B.

"BIRMINGHAM LEP COMPANY LTD"

And then we tell it what test to apply. In this case, we're saying 'IF the cell contains BIRMINGHAM LEP COMPANY LTD (we add quotation marks to indicate that this is text) then fetch the value from the relevant cell in column D, and add it to all others that meet the same criterion"

You can add further criteria too, with extra commas - for example:

, D:D, ">1000"

Would restrict the calculation further to those cells where the value in the D column was above 1000.

Finally, we end the formula with a closing parenthesis:

)

That's it. As I say, this is something you can also achieve with pivot tables - the main advantage of =SUMIFS (aside from its speed if you know what you're looking for) is that you can use wildcards

Using wildcards

One of the problems with pivot tables is dirty data where the same organisation may be referred to under slightly different names - for example "BBC" or "B.B.C". Wildcards allow you to be widen your scope to include some of these misspellings (although you're better cleaning up the data).

Here are a couple of examples adapting the formula above:

=SUMIFS(D:D,B:B,"BIRMINGHAM L?P COMPANY LTD")

The question mark indicates any single character will do, so this would not only add up values for BIRMINGHAM LEP COMPANY LTD but also BIRMINGHAM LAP COMPANY LTD, BIRMINGHAM LaP COMPANY LTD, and BIRMINGHAM LEP COMPANY LTD with an accent over the E.

The asterisk wildcard does the same, but allows for multiple characters, for example:

=SUMIFS(D:D,B:B,"BIRMINGHAM LEP*COMPANY LTD")

This would again include BIRMINGHAM LEP COMPANY LTD but also BIRMINGHAM LEP  COMPANY LTD and BIRMINGHAM LEPCOMPANY LTD - in other words where someone has mistyped the space or failed to type it all (it would also take in BIRMINGHAM LEP HOLIDAY COMPANY LTD or anything else between LEP and COMPANY)

By the way, there's another formula - COUNTIFS - which does a similar thing with counting occurrences. More in a future post...

Formatting numbers as text: =TO_TEXT

Here's a formula that came in useful today when trying to visualise time-based data.

Because years are also numbers, sometimes charting tools treat them as values, rather than labels.

Google Docs isn't great at formatting data as text, so here's my solution: =TO_TEXT

Let's say your year labels are in column A.

In cell B2 type the following:
=TO_TEXT(A2)

This will copy the value in cell A2, but make sure it is formatted as text.

The chart should now understand that these are labels, and not values to be included in the chart.

3 posts on using =importXML for scraping webpages

Although I normally publish posts on spreadsheet formulae here, I've been publishing a series of posts on the =importXML formula on the Online Journalism Blog instead. To make it easier to find them, here are the links:

How to scrape webpages and ask questions with Google Docs and =importXML

Asking questions of a webpage – and finding out when those answers change

Scraping data from a list of webpages using Google Docs