May 21, 2021 - Wreckypedia

Wreckypedia: Wrecks Database, Mapping and Charting.

This application lets users research wrecks by searching for them and viewing them on a map or chart.

Tech Stack Ruby on Rails 6, Ruby 2.7.1, MySQL, React on Rails, and Puma / Nginx on the server.

Integrations I’m using MapSource to provide the maps, and Navionics to provide the charts.

Data I’ve used data from the Admiralty database and scraped some public resources.

Hosting I’m hosting this website on a Digital Ocean Droplet.

Plans I’m working on SEO to make this website easier to find.

Wreckypedia is hosted here

Here is a screenshot of the site:

Aug 3, 2018 - LeJog Tracker

Land’s End to John o’ Groats tracker.

This app allows users to track their virtual progress from Land’s End to John o’ Groats on a Google map.

I’m using Ruby on Rails for the app, and the front end employs some React JSX for the map view and the leaderboard. GPX files are XML, so I load the coordinates by AJAX and plot them on the map; the leader positions are loaded dynamically in the same way.
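To sketch the GPX side: a route’s coordinates can be parsed out of the XML with Nokogiri and served as JSON for the AJAX call. This is a minimal illustration, not the app’s actual code; the controller and helper names are made up.

```ruby
# app/controllers/tracks_controller.rb (illustrative names)
require 'nokogiri'

class TracksController < ApplicationController
  # GET /tracks/:id/coordinates.json, fetched by the front end via AJAX
  def coordinates
    doc = Nokogiri::XML(File.read(gpx_path_for(params[:id])))
    doc.remove_namespaces! # GPX declares a namespace; drop it for simple XPath

    # GPX stores each track point as <trkpt lat="..." lon="...">
    points = doc.xpath('//trkpt').map do |pt|
      { lat: pt['lat'].to_f, lng: pt['lon'].to_f }
    end

    render json: points
  end

  private

  # Hypothetical helper: locate the stored GPX file for this route
  def gpx_path_for(id)
    Rails.root.join('db', 'gpx', "#{id.to_i}.gpx")
  end
end
```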

The tech stack is Ruby 2.4.0, Rails 5.1.6, PostgreSQL, and Resque (for background jobs). I’m using the background jobs to collect the data so it is ready for the user to see at any time. The services running are Resque, a Resque worker, and the Resque scheduler.

The Strava auth controller stores athlete authorisation tokens and athlete refresh tokens in the users table. There is a refresh-tokens job that queues a refresh-token job for each user at 10-second intervals.
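A minimal sketch of how that staggering could look with Resque and resque-scheduler follows; the job and client names are my own illustration, not the app’s actual code.

```ruby
# Illustrative sketch: stagger one refresh per user at 10-second intervals.
class RefreshTokensJob
  @queue = :strava

  def self.perform
    # Space the per-user jobs out so Strava isn't hit with a burst.
    User.find_each.with_index do |user, i|
      Resque.enqueue_in(i * 10, RefreshTokenJob, user.id) # resque-scheduler
    end
  end
end

class RefreshTokenJob
  @queue = :strava

  def self.perform(user_id)
    user = User.find(user_id)
    # Exchange the stored refresh token for fresh tokens; StravaClient is a
    # hypothetical wrapper around Strava's OAuth token endpoint.
    tokens = StravaClient.refresh(user.refresh_token)
    user.update!(auth_token: tokens[:access_token],
                 refresh_token: tokens[:refresh_token])
  end
end
```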

I use the Strava webhook. This posts data to this application when a signed-up user records a run. The app then uses the Strava data API to populate the details of that run.
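Roughly, the webhook endpoint could look like this: a hedged sketch assuming Strava’s documented event payload (object_type, object_id, aspect_type, owner_id), with FetchActivityJob as a hypothetical job name.

```ruby
class StravaWebhooksController < ApplicationController
  # External POST from Strava, so skip Rails' CSRF check for this endpoint.
  skip_before_action :verify_authenticity_token

  def create
    event = params.permit(:object_type, :object_id, :aspect_type, :owner_id)
    # Only new activities need fetching; do it in the background so the
    # endpoint can acknowledge quickly.
    if event[:object_type] == 'activity' && event[:aspect_type] == 'create'
      Resque.enqueue(FetchActivityJob, event[:owner_id], event[:object_id])
    end
    head :ok
  end
end
```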

I am hosting this website on a Digital Ocean Droplet. I enjoyed setting this up from a bare-bones Linux distribution: installing Ruby, Postgres, Nginx, and so on, and pointing my web address at it.

At the moment I have a problem: Strava denies me access to user data when the user is not connected with the API owner. I am working on this.

It is hosted here

Here is a screenshot of the site:

Aug 3, 2018 - RubyScraper

A Web-Scraping Application using Rails 5 and Ruby 2.4.4

This is a web-scraping application for my local running club, who like to compile results from the parkruns that our club members have visited.

The scraping is done in two stages: 1) an index site holds the list of sites where results are available, and 2) the app visits each of those sites to collect the information.

I’m using ‘Mechanize’ as my scraping gem. I started with Nokogiri, but found I could not set request headers to defeat anti-scraping measures with it, whereas Mechanize just works.
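Here is a sketch of the two-stage scrape with Mechanize; the URLs, headers, and selectors are placeholders, since the exact values depend on the sites involved.

```ruby
require 'mechanize'

agent = Mechanize.new do |a|
  a.user_agent_alias = 'Windows Chrome' # present a browser User-Agent
  a.request_headers  = { 'Accept-Language' => 'en-GB,en;q=0.9' }
end

# Stage 1: the index page lists the sites where results live.
index = agent.get('https://example.com/results-index')
result_links = index.links_with(href: /results/)

# Stage 2: visit each results page and pull the rows out of the table.
result_links.each do |link|
  page = link.click
  page.search('table tr').each do |row|
    # extract runner, time, position, etc. from the row here
  end
end
```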

I started using PostgreSQL because MySQL was not supported on Heroku; when problems with Heroku pushed me to hosting this on a cloud machine, I simply continued with Postgres.

I initially tried Heroku to host this. However, the app times out because it has a lot to do before the view can be displayed, particularly a lot of HTTP requests, which are costly and unavoidable. I was not able to alter the HTTP timeout on Heroku, so I moved to a Digital Ocean Droplet, which obviously allows me to alter the timeout, and which lets me learn about Capistrano, Nginx, and all the other things that platforms like Heroku just do for you. After a lot of development the site loads quickly now, but it uses offline jobs to achieve this; I could not get offline jobs (Resque workers) to run on Heroku, so I never went back to it.

For testing, I set up a Sinatra server (hosted in a separate Git repository) to serve up test assets in development and test modes, and I’m using the ‘vcr’ gem to record and replay those responses in test mode.
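The VCR setup is roughly like this (a sketch with illustrative paths and names): the first run records each HTTP interaction with the Sinatra fixture server into a cassette, and later runs replay it without touching the network.

```ruby
require 'vcr'

VCR.configure do |config|
  config.cassette_library_dir = 'spec/cassettes' # where recordings are stored
  config.hook_into :webmock                      # intercept HTTP via WebMock
end

# In a test, wrap the scraping call in a cassette: the first run records
# the Sinatra server's responses, and subsequent runs replay them.
VCR.use_cassette('parkrun_results') do
  ResultsScraper.fetch_all # hypothetical entry point into the scraper
end
```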

When I put this into production, I found I was getting “connection refused” errors when making too many requests too quickly, so I broke the requests up into individual jobs that fire at 10-second intervals using resque-scheduler’s delayed jobs. This has made it more reliable, and of course it still collects data from the other sites if one job fails. I’ve also got resque-scheduler set up to do some database administration jobs for me in the background.
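Sketched with illustrative names, the staggering looks like this: one job per results site, each fired 10 seconds after the last, so a refused connection only loses that one site’s results.

```ruby
class ScrapeSiteJob
  @queue = :scraping

  def self.perform(site_id)
    site = Site.find(site_id)
    # Fetch and parse this one site's results; a failure here
    # does not stop the other sites' jobs from running.
  end
end

# Fire one job per site at 10-second intervals (resque-scheduler).
Site.find_each.with_index do |site, i|
  Resque.enqueue_in(i * 10, ScrapeSiteJob, site.id)
end
```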

The app is database-heavy and takes time to do the scraping. To optimise this, I collect all the data in one big hash and then save it in a single DB transaction at the end instead of in individual transactions. I also allocate the age-grade and age-category positions on the hash before saving; this is much faster than saving each record individually through Active Record.
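The save step, sketched with illustrative names: everything accumulates in the in-memory hash (positions already allocated), and one transaction commits it all at the end.

```ruby
# Illustrative sketch: results_by_runner is the big in-memory hash, with
# age-grade and age-category positions already allocated on it.
ActiveRecord::Base.transaction do
  results_by_runner.each do |runner_id, rows|
    rows.each do |attrs|
      Result.create!(attrs.merge(runner_id: runner_id))
    end
  end
end
# One COMMIT at the end is far cheaper than a commit per row.
```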

It is hosted here

Here is a screenshot of the site: