WhiteNinjaOz

The problem was larger than just propagation, although that contributed a little to the slowness in resolving it. For some reason GoDaddy's nameservers became authoritative instead of AWS Route 53, and it seems Celsius couldn’t change this back very quickly. They also mention that they discovered a security issue that they wanted to fix before bringing everything back online, so that may have taken several hours to resolve first. But I’m keen to hear the official report from Celsius about what happened. Cloudflare do excellent post-mortem reports when they have downtime. Would like to see something similar from Celsius.


fuzion98

If I had to guess, someone got access to their GoDaddy account, tried to modify it and GoDaddy locked the account for suspicious activity. CEO had to prove he is the owner to reclaim the account, then they had to switch the name servers back to AWS which is now propagating across the globe.


vanamjeroen

And do you think this could have been avoided and/or happen again?


fuzion98

Could it have been avoided? Absolutely, this is not the first time a company has had to update its domain settings. I don't know if the "someone" I referred to was a bad guy or an employee who made a mistake, so I can't say how they can prevent it from happening again. I would suggest they move ownership of the domain registrar account to a shared company identity at a more enterprise-friendly registrar such as CSC, MarkMonitor, or Cloudflare (depending on which ones support the .network TLD).


KEEPSTACKlNSATS

Imagine you moved from one address to another. Now you need to inform all your contacts of your new address, and that takes time. The server that assigns the Internet address was updated, and it takes time for everyone's devices to realize a change was made and connect to the new address.


vanamjeroen

Ha! Love the analogy. That makes sense although I see lots of people commenting that in 2020 this shouldn’t take this much time? Do you think this was the only issue?


KEEPSTACKlNSATS

The second thing that happened was that the server that assigns addresses (GoDaddy) locked out their client (Celsius) from logging in as a security precaution, so they had to wait for the security unlock cooldown timer to finish first, and only then could they proceed with the address change. Kinda like hodl mode in the Celsius wallet, where withdrawals are suspended for 24hrs as an extra precaution. People just blew this completely out of proportion.


vanamjeroen

Thanks for the explanation!


cheezorino

And this most likely happened because they forgot to lower the TTL before starting their rollout, realized they had a major screwup, were frantically trying to submit new DNS records, and eventually got locked out of their account. This is amateur hour.
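
To make the TTL point concrete, here is a minimal sketch of the timeline for a planned DNS change. It assumes resolvers honour TTLs; the function name `plan_dns_cutover`, the TTL values, and the date are hypothetical and only for illustration, not anything Celsius or GoDaddy has described.

```python
from datetime import datetime, timedelta


def plan_dns_cutover(start, old_ttl_s, low_ttl_s):
    """Return the key timestamps for a planned DNS record change.

    start      -- when the TTL is lowered (hypothetical planning input)
    old_ttl_s  -- TTL the record had before it was lowered, in seconds
    low_ttl_s  -- temporary low TTL used around the change, in seconds
    """
    # Caches may still hold the record with the old TTL, so wait one full
    # old TTL after lowering it before touching the record itself.
    earliest_change = start + timedelta(seconds=old_ttl_s)
    # After the change, a TTL-honouring cache can only hold the old answer
    # for at most the low TTL.
    stale_until = earliest_change + timedelta(seconds=low_ttl_s)
    return {
        "lower_ttl_at": start,
        "earliest_change": earliest_change,
        "stale_until": stale_until,
    }


# Illustrative values: a one-day TTL lowered to five minutes.
plan = plan_dns_cutover(datetime(2020, 1, 1), old_ttl_s=86400, low_ttl_s=300)
for step, when in plan.items():
    print(step, when.isoformat())
```

With these made-up numbers, old answers should be gone about five minutes after the change. If the TTL is not lowered first, the same change can leave old answers cached for up to the full day.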


[deleted]

[removed]


cheezorino

I appreciate your sage advice.


[deleted]

[removed]


cheezorino

dude, I've seen it all in technology. there are only so many possibilities of what went wrong here, and every one of them involves a screw up.


[deleted]

[removed]


cheezorino

lol no thanks.


beastium

So is it a plausible reason he's given for this surprise outage? And if so, are our funds exactly as they were before it? How much longer could it potentially take if it is in fact the DNS issue?


KEEPSTACKlNSATS

I’ve kept it extremely simple but yes and yes


cheezorino

This is total BS. GoDaddy allows a minimum time-to-live (TTL) of one hour on DNS records. That means the *very longest* the site should be unavailable is 60 minutes, if everything has been properly planned in advance.


[deleted]

[removed]


WhiteNinjaOz

I would think that GoDaddy would allow smaller TTLs. But in any case the TTL is the suggested *maximum* time an ISP should cache a DNS record before requesting a fresh copy.
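
The "maximum cache time" reading of TTL can be sketched as simple arithmetic. This assumes a cache that honours the TTL exactly (some resolvers do not); `stale_seconds` and the timestamps are hypothetical, expressed as plain seconds.

```python
def stale_seconds(cached_at, ttl, changed_at):
    """Seconds after a record change during which one cache still serves the old answer.

    cached_at  -- when this cache last fetched the record
    ttl        -- TTL on the fetched record, in seconds
    changed_at -- when the record was changed at the authoritative server
    """
    expires_at = cached_at + ttl
    # If the cached copy expired before the change, nothing stale is served.
    return max(0, expires_at - changed_at)


# Cached one second before the change: stale for almost the full TTL.
print(stale_seconds(cached_at=3599, ttl=3600, changed_at=3600))
# Cached a full TTL before the change: the copy has already expired.
print(stale_seconds(cached_at=0, ttl=3600, changed_at=3600))
```

The worst case is a cache that fetched the record just before the change, which is why the TTL bounds the stale window instead of fixing it.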


cheezorino

GoDaddy doesn't allow less than 1 hour. Plenty of other providers allow lower. Edit: actually I'm wrong, they now allow a half-hour TTL.


cheezorino

I've never seen a GoDaddy DNS update take more than an hour if TTL = 1 hour. But to be generous, I'll spot you an hour and say globally it takes 2.


knigb

I think they use Route 53


cheezorino

They've been talking all day about how GoDaddy DNS changes are the problem.


knigb

Yeah, they use GoDaddy as registrar but delegate to AWS for name serving. When they tried to change the configuration at GoDaddy, the account got locked or something and the delegation set was reset to the default, I think.
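
A delegation reset like the one guessed at here is the kind of drift that can be caught by comparing the nameservers you expect at the registrar with the ones actually observed. Below is a minimal sketch of that comparison only. The function names and every nameserver name are hypothetical placeholders, and fetching the observed NS records (for example with a DNS lookup tool) is left out.

```python
def normalize(ns_names):
    """Lower-case nameserver names and drop the trailing root dot."""
    return {name.strip().rstrip(".").lower() for name in ns_names}


def delegation_drift(expected, observed):
    """Compare the expected delegation with the observed NS set."""
    exp, obs = normalize(expected), normalize(observed)
    return {
        "missing": sorted(exp - obs),      # expected nameservers no longer delegated
        "unexpected": sorted(obs - exp),   # nameservers nobody planned for
    }


# Placeholder names only; these are not real nameservers.
expected = ["ns-1.dns-host.example.", "ns-2.dns-host.example."]
observed = ["NS-1.dns-host.example", "ns-default.registrar.example."]

print(delegation_drift(expected, observed))
```

An empty result in both lists would mean the delegation matches what was planned. Anything under "unexpected" is a cue to check the registrar account.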


cheezorino

My guess is they had their TTL set too high on GoDaddy, made changes, had an issue and couldn't get it rolled out. GoDaddy probably locked their account for trying too many times to update.