Difference between revisions of "Server status"

From EtherWiki
(2007 November)
 
(24 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
__TOC__
 
__TOC__
  
== 2007 November ==
+
== 2012.07 ==
 +
 
 +
Unrelated to the [http://en.wikipedia.org/wiki/Leap_second#Examples_of_problems_caused_by_the_leap_second leap second debacle] that the Internet experienced last night, the ether server had some time zone issues today. Fixes were put into place and will be monitored.
 +
 
 +
--[[User:Sstrader|Sstrader]] 17:43, 1 July 2012 (EDT)
 +
 
 +
== 2011.11 ==
 +
 
 +
All content has been moved off Lunarpages and should be more responsive. This wiki has moved to the scottdstrader.com domain. Old urls will eventually be redirected.
 +
 
 +
16:34, 21 November 2011 (EST)
 +
 
 +
== 2011.06 ==
 +
 
 +
[[RadioWave]] has been moved off of [[Lunarpages]] and is back up under its old domain name.
 +
 
 +
--[[User:Sstrader|Sstrader]] 10:49, 5 June 2011 (PDT)
 +
 
 +
== 2011.05 ==
 +
 
 +
After relocating the databases last month in response to complaints from [[Lunarpages]] about excessive database requests, we were back up for a week before they took the web sites down again because of excessive web requests. Most of those requests appear to be coming from search engines, but until I block those, [[RadioWave]] will be offline.
 +
 
 +
--[[User:Sstrader|Sstrader]] 13:39, 13 May 2011 (PDT)
 +
 
 +
I'm working on getting [[RadioWave]] hosted elsewhere. It should be back up by early next week.
 +
 
 +
--[[User:Sstrader|Sstrader]] 17:55, 19 May 2011 (PDT)
 +
 
 +
I've mirrored RadioWave [http://rw.scottdstrader.com/ScheduleServlet?view=current_grid here] temporarily.
 +
 
 +
--[[User:Sstrader|Sstrader]] 19:48, 26 May 2011 (PDT)
 +
 
 +
== 2011.04 ==
 +
 
 +
Starting sometime on 20 Apr 2011, the web hosting company took the user profile database offline because of excessive connections. I'm working with them to get it back up. Until then, visitors will not be able to log in and will see all times in GMT/UTC.
 +
 
 +
Although the issue appears to stem from the [[Information on the "500 Servlet Exception" error message|PermGen issue]] seen during redeployments, I made a few additional changes:
 +
 
 +
# Fixed an exception being thrown when invalid IDs are passed to the event permalinks.
 +
# Introduced code to throttle login attempts based on frequency. This should mitigate the load from any scripts that are attempting dictionary/rapidfire attacks on the site. This technique is taken from [http://stackoverflow.com/questions/570160/throttling-login-attempts Throttling login attempts] and [http://www.codinghorror.com/blog/2009/01/dictionary-attacks-101.html Dictionary Attacks 101].
 +
# Blocked the three most active IP addresses: 1.202.221.2 bjtelecom.net, 66.249.68.70 googlebot, and 220.181.108.165 baidu.com.
 +
 
 +
--[[User:Sstrader|Sstrader]] 06:42, 22 April 2011 (PDT)
 +
 
 +
After daily tweaks, the sites ended up with these numbers:
 +
 
 +
<table class="wikitable">
 +
<tr>
 +
<th></th><th>04/22/11</th><th>04/27/11</th><th>Ratio (after/before)</th>
 +
</tr>
 +
<tr>
 +
<td>[[RadioWave]]</td><td>50083</td><td>2800</td><td>0.06</td>
 +
</tr>
 +
<tr>
 +
<td>[[EtherTV]]</td><td>598</td><td>24</td><td>0.04</td>
 +
</tr>
 +
<tr>
 +
<td>[[EventNett]]</td><td>8467</td><td>580</td><td>0.07</td>
 +
</tr>
 +
<tr>
 +
<td>EtherWiki</td><td>283</td><td>233</td><td>0.82</td>
 +
</tr>
 +
<tr>
 +
<td>MySQL</td><td>3.12</td><td>2.43</td><td>0.78</td>
 +
</tr>
 +
</table>
 +
 
 +
All hits decreased considerably except this wiki. [[Lunarpages]] only supports PHP 4.4.9, so the MediaWiki install can't be upgraded. Researching if 10 hits per hour can cause high MySQL loads.
 +
 
 +
--[[User:Sstrader|Sstrader]] 15:45, 28 April 2011 (PDT)
 +
 
 +
Databases have been relocated to avoid any issues with [[Lunarpages]]. Sites should be 90% back up with frequent monitoring over the next week.
 +
 
 +
--[[User:Sstrader|Sstrader]] 17:06, 7 May 2011 (PDT)
 +
 
 +
== 2010.04 ==
 +
 
 +
Over the past few days, the server has been slow to respond and the [[Information on the "500 Servlet Exception" error message|PermGen space exception]] has returned. Researching.
 +
 
 +
--[[User:Sstrader|Sstrader]] 13:29, 29 April 2010 (PDT)
 +
 
 +
Put in support ticket with [[Lunarpages]].
 +
 
 +
--[[User:Sstrader|Sstrader]] 15:08, 29 April 2010 (PDT)
 +
 
 +
Resolved. Lunarpages support replied that they didn't see anything wrong with the server and response time immediately went back to normal.
 +
 
 +
--[[User:Sstrader|Sstrader]] 17:17, 29 April 2010 (PDT)
 +
 
 +
== 2009.10 ==
 +
 
 +
[[RadioWave]] is experiencing the time zone issues again. Appeared around two days ago. Opened [https://support.lunarpages.com/tickets/view/1634058 ticket #1634058] with [[Lunarpages]]. Description:
 +
 
 +
:I'm seeing some inconsistencies in how the current time is reported in my JSPs. Using this test page:
 +
:
 +
:http://www.radiowavetuner.com/currenttime.jsp
 +
:
 +
:I print out the current time in all time zones, with each linking to a site that reports the current time in that time zone. Some appear correct (GMT, UTC, US/Arizona, EST) but others incorrect (EST5EDT, PST8PDT). On the surface it appears related to daylight savings. This divergence in actual and expected results began in the last couple of days.
 +
:
 +
:I appreciate any suggestions.
 +
 
 +
JVM version is 1.5.0_01-b08.
 +
 
 +
--[[User:Sstrader|Sstrader]] 20:08, 27 October 2009 (PDT)
 +
 
 +
Fixed by Lunarpages:
 +
 
 +
:This issue occurs because JDK contains its own internal timezone data and does not rely on the system timezone data (which is already correct). I have updated the JDK timezone data and all timezones should be displaying the correct times. Please let us know if you have any questions or issues regarding this matter and we'll be happy to assist you.
 +
 
 +
--[[User:Sstrader|Sstrader]] 12:51, 28 October 2009 (PDT)
 +
 
 +
== 2009.07 ==
 +
 
 +
Yesterday at around 18:00 EST, the [[Lunarpages]] servers and hosting went down, so all Ether sites went down too. They had [http://www.lunarforums.com/lunarpages_webhosting_help/san_diego_dc_issues_july_14th-t52947.0.html an issue with their San Diego datacenter] but were back up in a couple of hours.
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:02, 15 July 2009 (PDT)
 +
 
 +
== 2009.05 ==
 +
 
 +
The issue appears to have been caused by an incomplete code deployment; by midday, it was resolved. I'll keep an eye on any possible relapse.
 +
 
 +
--[[User:Sstrader|Sstrader]] 21:00, 21 May 2009 (PDT)
 +
 
 +
Updates to [[RadioWave]]'s time zones processing code aren't playing well on the host's server. A fix should be available in the next day or so.
 +
 
 +
--[[User:Sstrader|Sstrader]] 21:34, 20 May 2009 (PDT)
 +
 
 +
== 2009.03 ==
 +
 
 +
After the time change for daylight savings, several issues appeared in [[RadioWave]]:
 +
 
 +
* Setting your account time zone to [http://www.convertunits.com/time/zone/EST5EDT EST5EDT] does not reflect the hour change (appears an hour behind). This is probably due to the [[web hosting]] servers missing the [http://java.sun.com/javase/tzupdater_README.html Timezone Updater]. This is being researched.
 +
* Several stations chose this time to update their schedule format and so their events are not getting imported into the RadioWave database. This should be fixed soon.
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:15, 11 March 2009 (PDT)
 +
 
 +
Updates to the issues above:
 +
 
 +
* The time zone issue resolved itself after a few weeks, so this is probably an unpatched JRE.
 +
* Schedule parser updated to accommodate new formats of the public broadcasting stations (KALW, WABE, WCLK, WFCR, WHQR, WXXI-FM) hosted at http://www.publicbroadcasting.net.
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:12, 8 April 2009 (PDT)
 +
 
 +
== 2008.06 ==
 +
 
 +
Getting <code>NoClassDefFoundError</code> exception on [[RadioWave]] JSP pages:
 +
 
 +
500 Servlet Exception
 +
java.lang.NoClassDefFoundError
 +
    at _RadioWave__jsp._jspService(/RadioWave.jsp:19)
 +
    at com.caucho.jsp.JavaPage.service(JavaPage.java:75)
 +
    at com.caucho.jsp.Page.subservice(Page.java:506)
 +
    at com.caucho.server.http.FilterChainPage.doFilter(FilterChainPage.java:182)
 +
    at com.caucho.server.http.Invocation.service(Invocation.java:315)
 +
    at com.caucho.server.http.CacheInvocation.service(CacheInvocation.java:135)
 +
    at com.caucho.server.http.RunnerRequest.handleRequest(RunnerRequest.java:346)
 +
    at com.caucho.server.http.RunnerRequest.handleConnection(RunnerRequest.java:274)
 +
    at com.caucho.server.TcpConnection.run(TcpConnection.java:139)
 +
    at java.lang.Thread.run(Thread.java:595)
 +
 +
Resin 2.1.13 (built Thu Apr 1 10:57:42 PST 2004)
 +
 
 +
Working for [[EventNett]]. Different from the [[Information on the "500 Servlet Exception" error message|PermGen space exception]].
 +
 
 +
Ticket #821813 opened at [[Lunarpages]].
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:27, 4 June 2008 (PDT)
 +
 
 +
Back online after several emails and phone calls. Still haven't received a message saying that they fixed the problem.
 +
 
 +
--[[User:Sstrader|Sstrader]] 20:30, 4 June 2008 (PDT)
 +
 
 +
== 2008.05 ==
 +
 
 +
JSPs being served as raw text. Put in support ticket with [[Lunarpages]].
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:33, 17 May 2008 (PDT)
 +
 
 +
The issue was fixed after a couple of hours.
 +
 
 +
--[[User:Sstrader|Sstrader]] 07:21, 19 May 2008 (PDT)
 +
 
 +
== 2008.01 ==
 +
 
 +
Code releases are causing periodic errors reporting "500 Servlet Exception". This is probably the Resin error first encountered in October of last year.
 +
 
 +
--[[User:Sstrader|Sstrader]] 06:03, 24 January 2008 (PST)
 +
 
 +
== 2007.11 ==
  
 
Same cycle of down/contact host/back up. Still working on it...
 
Same cycle of down/contact host/back up. Still working on it...
Line 15: Line 203:
 
--[[User:Sstrader|Sstrader]] 11:25, 1 November 2007 (PDT)
 
--[[User:Sstrader|Sstrader]] 11:25, 1 November 2007 (PDT)
  
== 2007 October ==
+
== 2007.10 ==
  
 
For around an hour up until 3:30 PM EST today, Resin was reporting an OutOfMemory error (more info from Caucho [http://wiki.caucho.com/Java.lang.OutOfMemoryError:_PermGen_space here]). This will generally appear when new code is uploaded, but happened unexpectedly today. Will monitor.
 
For around an hour up until 3:30 PM EST today, Resin was reporting an OutOfMemory error (more info from Caucho [http://wiki.caucho.com/Java.lang.OutOfMemoryError:_PermGen_space here]). This will generally appear when new code is uploaded, but happened unexpectedly today. Will monitor.
Line 21: Line 209:
 
--[[User:Sstrader|Sstrader]] 12:34, 11 October 2007 (PDT)
 
--[[User:Sstrader|Sstrader]] 12:34, 11 October 2007 (PDT)
  
== 2007 September ==
+
== 2007.09 ==
  
 
Response from Lunarpages:
 
Response from Lunarpages:
Line 51: Line 239:
 
--[[User:Sstrader|Sstrader]] 07:26, 13 September 2007 (PDT)
 
--[[User:Sstrader|Sstrader]] 07:26, 13 September 2007 (PDT)
  
== 2007 June ==
+
== 2007.06 ==
  
 
Response from Lunarpages on last week's outage:
 
Response from Lunarpages on last week's outage:

Latest revision as of 21:43, 1 July 2012

2012.07

Unrelated to the leap second debacle that the Internet experienced last night, the ether server had some time zone issues today. Fixes were put into place and will be monitored.

--Sstrader 17:43, 1 July 2012 (EDT)

2011.11

All content has been moved off Lunarpages and should be more responsive. This wiki has moved to the scottdstrader.com domain. Old urls will eventually be redirected.

16:34, 21 November 2011 (EST)

2011.06

RadioWave has been moved off of Lunarpages and is back up under its old domain name.

--Sstrader 10:49, 5 June 2011 (PDT)

2011.05

After relocating the databases last month in response to complaints from Lunarpages about excessive database requests, we were back up for a week before they took the web sites down again because of excessive web requests. Most of those requests appear to be coming from search engines, but until I block those, RadioWave will be offline.

--Sstrader 13:39, 13 May 2011 (PDT)

I'm working on getting RadioWave hosted elsewhere. It should be back up by early next week.

--Sstrader 17:55, 19 May 2011 (PDT)

I've mirrored RadioWave here temporarily.

--Sstrader 19:48, 26 May 2011 (PDT)

2011.04

Starting sometime on 20 Apr 2011, the web hosting company took the user profile database offline because of excessive connections. I'm working with them to get it back up. Until then, visitors will not be able to log in and will see all times in GMT/UTC.

Although the issue appears to stem from the PermGen issue seen during redeployments, I made a few additional changes:

  1. Fixed an exception being thrown when invalid IDs are passed to the event permalinks.
  2. Introduced code to throttle login attempts based on frequency. This should mitigate the load from any scripts that are attempting dictionary/rapidfire attacks on the site. This technique is taken from Throttling login attempts and Dictionary Attacks 101.
  3. Blocked the three most active IP addresses: 1.202.221.2 bjtelecom.net, 66.249.68.70 googlebot, and 220.181.108.165 baidu.com.

--Sstrader 06:42, 22 April 2011 (PDT)
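
The frequency-based login throttling in item 2 above could look roughly like the following. This is a minimal sketch, not the code deployed on the site: the class name `LoginThrottle`, the sliding-window approach, and the choice of key (user name or IP address) are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: allow at most maxAttempts login attempts per key
 * (for example a user name or an IP address) within a sliding time window.
 */
public class LoginThrottle {
    private final int maxAttempts;
    private final long windowMillis;
    // Timestamps of recent allowed attempts, oldest first, per key.
    private final Map<String, Deque<Long>> attempts = new HashMap<>();

    public LoginThrottle(int maxAttempts, long windowMillis) {
        this.maxAttempts = maxAttempts;
        this.windowMillis = windowMillis;
    }

    /** Returns true and records the attempt if it is allowed at nowMillis. */
    public synchronized boolean tryAttempt(String key, long nowMillis) {
        Deque<Long> times = attempts.computeIfAbsent(key, k -> new ArrayDeque<>());
        // Drop timestamps that have aged out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        if (times.size() >= maxAttempts) {
            return false; // too frequent: reject without recording
        }
        times.addLast(nowMillis);
        return true;
    }
}
```

A caller would pass `System.currentTimeMillis()` as `nowMillis` and skip the password check when `tryAttempt` returns false. The map is never pruned in this sketch, so a long-running server would also need to evict idle keys.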

After daily tweaks, the sites ended up with these numbers:

            04/22/11   04/27/11   Ratio (after/before)
RadioWave   50083      2800       0.06
EtherTV     598        24         0.04
EventNett   8467       580        0.07
EtherWiki   283        233        0.82
MySQL       3.12       2.43       0.78

All hits decreased considerably except this wiki. Lunarpages only supports PHP 4.4.9, so the MediaWiki install can't be upgraded. Researching if 10 hits per hour can cause high MySQL loads.

--Sstrader 15:45, 28 April 2011 (PDT)

Databases have been relocated to avoid any issues with Lunarpages. Sites should be 90% back up with frequent monitoring over the next week.

--Sstrader 17:06, 7 May 2011 (PDT)

2010.04

Over the past few days, the server has been slow to respond and the PermGen space exception has returned. Researching.

--Sstrader 13:29, 29 April 2010 (PDT)

Put in support ticket with Lunarpages.

--Sstrader 15:08, 29 April 2010 (PDT)

Resolved. Lunarpages support replied that they didn't see anything wrong with the server and response time immediately went back to normal.

--Sstrader 17:17, 29 April 2010 (PDT)

2009.10

RadioWave is experiencing the time zone issues again. Appeared around two days ago. Opened ticket #1634058 with Lunarpages. Description:

I'm seeing some inconsistencies in how the current time is reported in my JSPs. Using this test page:
http://www.radiowavetuner.com/currenttime.jsp
I print out the current time in all time zones, with each linking to a site that reports the current time in that time zone. Some appear correct (GMT, UTC, US/Arizona, EST) but others incorrect (EST5EDT, PST8PDT). On the surface it appears related to daylight savings. This divergence in actual and expected results began in the last couple of days.
I appreciate any suggestions.

JVM version is 1.5.0_01-b08.

--Sstrader 20:08, 27 October 2009 (PDT)

Fixed by Lunarpages:

This issue occurs because JDK contains its own internal timezone data and does not rely on the system timezone data (which is already correct). I have updated the JDK timezone data and all timezones should be displaying the correct times. Please let us know if you have any questions or issues regarding this matter and we'll be happy to assist you.

--Sstrader 12:51, 28 October 2009 (PDT)
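
The kind of check the test page performs can be sketched with plain `java.util` classes, which read the time zone rules bundled with the JVM rather than the operating system's. The class name `ZoneClock` and its methods are hypothetical; this is not the actual `currenttime.jsp`, only an illustration of the idea under that assumption.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical sketch of a per-zone clock check; not the real currenttime.jsp. */
public class ZoneClock {
    /** Formats the instant in the given zone using the JVM's bundled tz data. */
    public static String format(Date instant, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm zzz");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(instant);
    }

    /** Offset from UTC in whole hours at the instant, daylight saving included. */
    public static int offsetHours(Date instant, String zoneId) {
        return TimeZone.getTimeZone(zoneId).getOffset(instant.getTime()) / 3600000;
    }

    public static void main(String[] args) {
        Date now = new Date();
        // The zone IDs named in the ticket description.
        String[] ids = {"GMT", "UTC", "US/Arizona", "EST", "EST5EDT", "PST8PDT"};
        for (String id : ids) {
            System.out.println(id + "  " + format(now, id) + "  offset=" + offsetHours(now, id));
        }
    }
}
```

Fixed-offset IDs such as GMT and EST do not depend on daylight saving rules, while EST5EDT and PST8PDT do, which would be consistent with only the latter showing wrong times when the JVM's rules are out of date.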

2009.07

Yesterday at around 18:00 EST, the Lunarpages servers and hosting went down, so all Ether sites went down too. They had an issue with their San Diego datacenter but were back up in a couple of hours.

--Sstrader 07:02, 15 July 2009 (PDT)

2009.05

The issue appears to have been caused by an incomplete code deployment; by midday, it was resolved. I'll keep an eye on any possible relapse.

--Sstrader 21:00, 21 May 2009 (PDT)

Updates to RadioWave's time zones processing code aren't playing well on the host's server. A fix should be available in the next day or so.

--Sstrader 21:34, 20 May 2009 (PDT)

2009.03

After the time change for daylight savings, several issues appeared in RadioWave:

  • Setting your account time zone to EST5EDT does not reflect the hour change (appears an hour behind). This is probably due to the web hosting servers missing the Timezone Updater. This is being researched.
  • Several stations chose this time to update their schedule format and so their events are not getting imported into the RadioWave database. This should be fixed soon.

--Sstrader 07:15, 11 March 2009 (PDT)

Updates to the issues above:

  • The time zone issue resolved itself after a few weeks, so this is probably an unpatched JRE.
  • Schedule parser updated to accommodate new formats of the public broadcasting stations (KALW, WABE, WCLK, WFCR, WHQR, WXXI-FM) hosted at http://www.publicbroadcasting.net.

--Sstrader 07:12, 8 April 2009 (PDT)

2008.06

Getting NoClassDefFoundError exception on RadioWave JSP pages:

500 Servlet Exception
java.lang.NoClassDefFoundError
    at _RadioWave__jsp._jspService(/RadioWave.jsp:19)
    at com.caucho.jsp.JavaPage.service(JavaPage.java:75)
    at com.caucho.jsp.Page.subservice(Page.java:506)
    at com.caucho.server.http.FilterChainPage.doFilter(FilterChainPage.java:182)
    at com.caucho.server.http.Invocation.service(Invocation.java:315)
    at com.caucho.server.http.CacheInvocation.service(CacheInvocation.java:135)
    at com.caucho.server.http.RunnerRequest.handleRequest(RunnerRequest.java:346)
    at com.caucho.server.http.RunnerRequest.handleConnection(RunnerRequest.java:274)
    at com.caucho.server.TcpConnection.run(TcpConnection.java:139)
    at java.lang.Thread.run(Thread.java:595)

Resin 2.1.13 (built Thu Apr 1 10:57:42 PST 2004)

Working for EventNett. Different from the PermGen space exception.

Ticket #821813 opened at Lunarpages.

--Sstrader 07:27, 4 June 2008 (PDT)

Back online after several emails and phone calls. Still haven't received a message saying that they fixed the problem.

--Sstrader 20:30, 4 June 2008 (PDT)

2008.05

JSPs being served as raw text. Put in support ticket with Lunarpages.

--Sstrader 07:33, 17 May 2008 (PDT)

The issue was fixed after a couple of hours.

--Sstrader 07:21, 19 May 2008 (PDT)

2008.01

Code releases are causing periodic errors reporting "500 Servlet Exception". This is probably the Resin error first encountered in October of last year.

--Sstrader 06:03, 24 January 2008 (PST)

2007.11

Same cycle of down/contact host/back up. Still working on it...

--Sstrader 15:27, 5 November 2007 (PST)

After contacting Lunarpages, and being down most of yesterday, the server seems to be staying up today.

--Sstrader 09:16, 2 November 2007 (PDT)

There have been several periods of hour-long downtime over the past week (11:25, 1 November 2007 (PDT)). Checking with Lunarpages...

--Sstrader 11:25, 1 November 2007 (PDT)

2007.10

For around an hour up until 3:30 PM EST today, Resin was reporting an OutOfMemory error (more info from Caucho here). This will generally appear when new code is uploaded, but happened unexpectedly today. Will monitor.

--Sstrader 12:34, 11 October 2007 (PDT)

2007.09

Response from Lunarpages:

We apologize for any inconveniences caused. Please be aware that when you reply to your ticket before we have answered it, it will be sent to the bottom of the ticket queue again and this will delay our reply.
We don't want you to be unhappy with the service but if there is any action on the server that is causing a significant degradation of services and affects all users adversely, we have to take a pro-active action. Your account was utilizing excessive MySQL resources, causing issues on the server. That's why your MySQL access was disabled without prior warning.
If you require a higher usage then allowed on shared servers you might want to try a VPS or Dedicated hosting plan.

--Sstrader 09:13, 19 September 2007 (PDT)

Back up. I had 9 extra database connections and so Lunarpages disabled my database user. Here's my second response to them after I updated the database caching code:

I've updated the code and am monitoring connections.
Secondary question: Why would you basically disable my web sites, inform me of it after the fact, and then not even reply to my requests 2 hours after the fact? I began working on this as soon as I received your email and worked quickly to resolve it. Why wasn't I returned the courtesy of a prompt response to my (simple) questions? I'm paying money and you're offering a service.
I'm not trying to be sarcastic. If for some reason I'm expecting too much, please clarify, but I really think you could have at least: given me something like a 2-hour warning (my 9 extra, inactive connection should not have degraded your server), and responded to my questions (after all, that speeds up the fix).
Thank you.

I'll post their response here when I receive it (or, if they get angry and bump me, on a new host).

--Sstrader 10:31, 13 September 2007 (PDT)

Both EventNett and RadioWave are currently having issues with database access. This should be resolved in a few hours.

--Sstrader 07:26, 13 September 2007 (PDT)

2007.06

Response from Lunarpages on last week's outage:

Our Muphrid server had experienced some issues this morning. The issues have been resolved now and all services have been restored. We apologize for the inconvenience.

--Sstrader 12:18, 25 June 2007 (PDT)

Both sites were down for a few hours today possibly due to disk space limits. No response yet from Lunarpages, but both sites are back up now.

--Sstrader 12:09, 22 June 2007 (PDT)

Redeployed EventNett and it's back up. Not sure what happened but every call to my code returned java.lang.NoClassDefFoundError as if WEB-INF/classes got dropped from the classpath.

--Sstrader 16:39, 7 June 2007 (PDT)
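
One way to test the suspicion that WEB-INF/classes dropped off the classpath is to ask the class loader directly whether a class resolves and where it came from. The sketch below is an assumption-laden illustration: `ClasspathProbe` is a hypothetical helper, and the class names passed to it would be the application's own.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic: reports whether a class can be loaded and from where. */
public class ClasspathProbe {
    public static String probe(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes typically report no code source.
            return className + " loaded from " + (src == null ? "bootstrap/JDK" : src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is caught here too.
            return className + " NOT loadable: " + e;
        }
    }
}
```

Calling `probe` from a JSP with one of the application's class names would show either the location it was loaded from or the loading failure, without waiting for a 500 page.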

EventNett has been down since around 2:00 PM EST today, 7 June 2007. I've contacted the hosting service and they report no problems with the server. RadioWave is still up. I will be researching the problem this evening.

--Sstrader 13:22, 7 June 2007 (PDT)