Sunday 10 June 2018

A brief look at Excel Dates & Times

The mistake is right there in the title - Thinking of dates and times as different things, or maybe thinking of a time as occurring within a date like a file inside of a folder, they're actually intrinsically linked together and should be used as so. Do you ever do things with dates and times and stumble over them or force them to be pure texts because "Excel keeps changing them!". Well this post is for you.

The Problem

So you're working with a moment in time and you take the naive route of having one column for your dates, and a second column next to it for your times. You end up with something looking like:

Date Time
2018-01-01 08:00:00

And you see nothing wrong with that. Quite reasonable you think. So you move on without realising you've damned yourself and the next person who has the misfortune of maintaining this disaster-in-progress.

So what's the problem? Well eventually lets say you need some durations. A pretty simple calculation, just one time minus the other. So it'll look like:

Start Date Start Time Stop Time Stop Time - Start Time
2018-01-01 08:00:00 18:00:00 10:00:00
2018-01-02 15:00:00 17:00:00 02:00:00
2018-01-03 23:00:00 02:00:00 ####################

Ok great! Now we have our dura-... Wait a second what happened to the duration for the 3rd? Well Excel in it's default settings doesn't like negative dates and simple refuses to display them, it'll show a bunch of hashes instead. "But it's not negative!" I hear you scream out. Well it is. 02:00:00 is before 23:00:00. A number minus a larger number is negative. Deal with it.

Well we can fix this can't we? Just rework our formula a bit. We can do this with an if statement like:

='Stop Time' - 'Start Time' + IF('Stop Time' < 'Start Time', 1, 0) 

Sort of. It'll work if your moments can be no more than 24 hours apart but then it'll break down and it'll be tricky to notice because there won't be an error - The formula isn't at fault - Your logic is.

So we change the formula a bit to account for the dates. To do this we need to find the different between the times, then the different between the dates and add them together. Ridiculous I know, but this is the solution I've seen used.

=('Stop Time' - 'Start Time') + ('Stop Date' - 'Start Date')

Well if it stupid but it works, it's not stupid right? Wrong. That sound you hear is the current maintainer of this mess banging on your door and they're out for blood.

The Solution 

Just don't split them and you can do:

='Stop Moment' - 'Start Moment'

Job done. Drinks all round. You also get the added bonus that these will now sort properly without having to sort on two columns. You're probably not happy with this though, it solves one problem and presents another - Now you want to reference "Just the date" bit or "just the time" bit. Well there are two formula that can help you here.

Just the date:
=Int('Some Moment')

Just the time:
=Mod('Some Moment', 1) 

They're a bit confusing at first - But you've done something clever by getting this far, you've started to understand what's actually going on. Know thy enemy and all that. Don't be afraid to make a series of extra calculated columns that do these bits for you - Calculations are cool, just don't split up your actual data!

Why this works

Now for the interesting bit - What is Excel actually doing to our dates and times? When I type a date or a time in, Excel sometimes makes it look different! Well that's because it recognises a Date or Time (Or DateTime/Moment) and tries to help you out, and you, being the ungrateful swine you are are messing with it's grand plan.

So Excel stores Dates as whole numbers as of  a certain date, the "start date" isn't really important. You can refer to this "zero" date as the Excel Epoch if you want to sound cool. But it won't come up in conversation much. If it does then find some more interesting friends.

If you type your date in and convert it's format to "General", it'll show you the underlining value. It'll be like something like "42867". That's 42,867 days since the epoch date. Sounds over-designed right? Well computers find it much easier to use numbers than to disseminate a formatted bunch of characters every time it needs to do math - The reverse of this is if you type a number into a cell and change the format to date it'll show you what date that many days from the epoch represents. Put in zero and you'll get 1900-01-00. That's not even a real date. The answer here is "Don't think about it".

So dates are whole numbers. Your stop date might be 40,001 and your start date is 40,000. Minus the two of them and you end up with 1. Which is how many days between them. Super simple stuff. Here's a table:

Date Format Numeric Format
1900-01-00 0
1900-01-01 1
1900-01-02 2
...
2018-06-10 43259
2018-06-11 43260

So on to times. The simple explanation here is that if whole numbers are whole dates, partial numbers are partial dates. We call those partial dates "Times". Imagine a clock face, when the hand points to 6am (06:00:00), it's 25% of the way through the day - Excel sees it the same and stores this as "0.25". So to extend our dates example:

Date Format Numeric Format
2018-06-10 00:00:00 432590
2018-06-10 06:00:00 432590.25
2018-06-10 12:00:00 432590.5
2018-06-10 20:43:25 432590.863483796296296

So with this we can go back to our earlier problem. We need the durations between two moments that cross midnight. Combining the two looks like:

Date Format Number Format
Start Moment 2018-01-01 23:00:00 43099.958333333333333
Stop Moment 2018-01-02 02:00:00 43100.083333333333333
Difference 1900-01-00 03:00:00 0.125

The "1900-01-00" part is weird, but we usually discard this by just formatting a time. If you formatted as date you'd see it! But anyway, you can see by keeping our values linked together it makes it a lot easier to calculate off the back of them. That's a good thing. No matter how clever you are or think you are, this spreadsheet you're going to create will become a confusing mess. Try and make it less of a mess as possible!

Time Spans

So here am I prattling on about how times and dates are the same thing, but then I show you a difference and say "Discard the date part". I'm not a hypocrit, we've just created another type of value called a "Time Span". Time spans are not very obvious in Excel, I don't think it mentions them anywhere - Why am I mentioning it? Well because what happens if you have more than 24 hours?

Lets say you minus 2018-01-01 from 2018-01-03. You'll end up with 2. Or 1900-01-02. Format that as a time and you end up with "00:00:00". That's not right. We have 48:00:00. How do I make Excel do that?

The answer is to format it as "[h]:mm". [h] is a special format for "Total hours". Plug this bad boy in and it'll display what you want. You're welcome.

Wait what about those formulae you mentioned earlier?

Oh yeah, Int and Mod. These are actually really simple. Int returns only the "Whole" part of the number, so 42568.25 becomes 42568. Now it's a whole date. Mod is a little bit more complex in that it divides your number and returns the remainder. So 1 will divide into 45678.25 45,678 times and leave a remainder of 0.25. 0.25 is your "time part" and really means 06:00:00. Simple.

Closing thoughts

This got way more in depth than I really wanted and I've probably not done it much justice. You might still be thinking "Why would I want to do this? It's simpler for me to just split them up". Well you're making your job harder. If your record has 2 moments you're now juggling 4 values which lose some of their meaning unless they're combined - So you might as well just leave them combined. Comparison will also work better and VBA works very similarity so it'll give you a head-start there as well!

Get your head around the concepts discussed here and you'll have a munch easier time dealing with time I promise you.

I might revisit this at a later date or create a "Working with datetimes" guide/rant. For that to help you make sure you're not seperating the two values!

Thanks for reading!

New Beginnings

Hello Internet!

I've decided to start a blog, there are two reasons.

The first is to share some of my experiences in the tech world which hopefully someone out there might find useful. I've come across a lot of shit in my time, the sort of shit we've all had to deal with at one time or another; confusing documentation, indecipherable requests and unworkable problems. I think some of it is quite interesting but unfortunately (Or fortunately, I've not decided yet) it's all stuck in my head.

The second reason is a bit more seflish - I've always wanted to write (I love Sci-fi!) but have never got up the courage to actually start, and when you want to start something that's literally the best thing you can do. Inaction breeds inaction afterall. Blogging is probably a bit different to Sci-fi but I'm hoping it might give my writing style a bit of a kick.

So there it is, maybe nobody will ever read this and that's fine too. Just pretend I'm giving a thumbs up right now and please excuse the oncoming cringe.

I think I'll try and do a post a week, I might have to scrape the bottom of the barrel to do so but there might be something good down there. Good luck and godspeed to us all.

- Richard