Wednesday, February 17, 2016

Our Daily Scrape, a recipe

Step 1: Take a snapshot of the database; archive it

See the excellent article, written by the all-around good guy, here (url:

Step 2: Scrape

Scrape – described at the bottom of the Trading Analytics tab – is an application that scrapes the top5s securities lists from Google Finance and saves the results in two places: a graph database (configuration in the environment) and a semi-structured matrix of historical top5s.

The methodology of scrape is this. After the trading day, at 6 pm-ish, you execute scrape:
geophf:writing geophf$ scrape
And it does its thing (see below under 'enscrape').

Now, if you, like I often do, sleep on your keyboard and wake up after midnight, do not run scrape! Instead, run enscrape with the previous day's date as the argument: the top5s are for yesterday, not today, so we wish to enter those data for that day:
geophf:writing geophf$ enscrape 2016-02-16
######################################################################## 100.0%
Saved to /Users/geophf/Documents/OneDrive/work/1HaskellADay/Seer/sources/google/2016-02-16-index.html ...

Updated /Users/geophf/Documents/OneDrive/work/1HaskellADay/Seer/data/top5s.csv with 2016-02-16 data
HTTP/1.1 100 Continue

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 17 Feb 2016 09:18:38 GMT
Content-Type: application/json
Content-Length: 125
Connection: keep-alive
Access-Control-Allow-Origin: *


Saved 2016-02-16 top 5s to GrapheneDB

The 'thing' is this: you must run scrape/enscrape before the markets open the next trading day. As soon as the markets open, the top5s for the previous day go away and start to fluctuate with the market, minute-to-minute.

Scrape after 6 pm, enscrape before 9 am (hopefully before 8 am), and that's your window; don't screw up the data by violating that window.

Fer realz, yo.
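
To make the window concrete, here is a minimal sketch in Python of the decision described above. The function name and the exact cutoffs (18:00 and 09:00 local time) are my assumptions for illustration; this is not part of scrape or enscrape, and it ignores weekends and market holidays entirely.

```python
from datetime import datetime, time, timedelta
from typing import Optional

# Assumed cutoffs, taken from "after 6 pm" and "before 9 am" in the text.
SCRAPE_OPENS = time(18, 0)
ENSCRAPE_CLOSES = time(9, 0)

def scrape_command(now: datetime) -> Optional[str]:
    """Return the command line to run at `now`, or None outside the window."""
    if now.time() >= SCRAPE_OPENS:
        # Same evening: the top5s still belong to today.
        return "scrape"
    if now.time() < ENSCRAPE_CLOSES:
        # Past midnight: the top5s belong to the previous calendar day.
        yesterday = now.date() - timedelta(days=1)
        return f"enscrape {yesterday.isoformat()}"
    # Between 9 am and 6 pm: the lists are moving with the market, so run nothing.
    return None
```

For instance, scrape_command(datetime(2016, 2, 17, 4, 18)) returns "enscrape 2016-02-16", which is the invocation shown above.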

Step 3: Capture Top5s Data in the Daily Report

So, you have your daily reports – e.g.: – divided into two parts: the reportage and the analysis. The reportage is this:

Copy and paste the top5s for the day into the report. That is, from the above scrape/enscrape run-off, copy:
Updated /Users/geophf/Documents/OneDrive/work/1HaskellADay/Seer/data/top5s.csv with 2016-02-16 data
... into the daily report.

Open up your graph database and get a screen shot of today's top5s with the previous day's top5s categories expanded to see interesting day-to-day trends (precursor to analysis), e.g.:

Note in the above screen shot that before I did scrape, I exported a snapshot of the database (then shunted that export off to my company's Google Drive).

Step 4: Analysis

Note that when you expand and tease apart the graph of today's top5s, Some Stocks Start to Stand out Stupendously (I call it the S5-effect) (I just invented that, actually). Pick the one that's of interest to you.

interest, n.: 1. what you don't get on your money in a savings account anymore
2. whatever is of interest to you, see: 'interest.'

That's very ... 'helpful'! NOT! 

So, to help in a substantive way, I am developing tools to automate the 'feelz' for what's interesting – particularly the Repeatinator2000! and the new, improved GAPINATOR3004!! – but these are very much alpha-stage tools at present, so you have to come up with or develop with practice your own feel for what looks interesting to you for now.

You know: make your own decisions, ... liek: on your own, liek.

Today, I picked out LMCB as it has multiple connections, and I hadn't seen it before.

When you pick a stock, run it through analyze, a program described at the bottom of the Trading Analytics tab:
geophf:writing geophf$ analyze LMCB
analyze: Ratio has zero denominator

Okay, whoopsie! This does happen (like twice in the last nine months), and it happens on johnny-come-latelies to the top5s list, that is, possibly, newly minted billion+-dollar companies that don't have 3 months of trading history.

Possible? Maybe? Maybe they just went public, or maybe they changed stock symbols and their old trading information doesn't carry forward?

I don't know. I don't care. I just move on and pick a different stock to analyze.

So, missed opportunities here? Perhaps. And to convince me of that, write a white paper on how I am missing out big-time on these rare opportunities.


So, let's regroup. I just picked BAC because it had some good things going, but, in retrospect (i.e.: if I had slept on this, perhaps), maybe LPLA would've been an interesting case for analysis.

The Markets: so many interesting case studies! So little time!
geophf:writing geophf$ analyze BAC
Wrote analysis files for BAC
The analyze tool spits out three CSV files: BAC-EMAS.csv, BAC-kds.csv, and BAC-SMAS.csv. Since these are comma-separated value files, you can easily load them into a data visualization tool of your choice (e.g.: Excel for you, maybe. Me? I use Numbers, because I'm not stupid: I own a Mac), then take screen shots of your analytics results. You can read up on what SMAs, EMAs and the %K vs. the %D lines of Stochastic Oscillators mean on Investopedia, e.g.: here's the write-up on SMAs: 
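
If you'd rather see the arithmetic than read about it, here is a rough sketch in Python of the textbook definitions of those indicators. This is not how analyze computes them (I am only illustrating the standard formulas); the function names, the 14- and 3-period defaults, and seeding the EMA with the first price are all assumptions made for the example.

```python
from typing import List, Optional

def sma(prices: List[float], n: int) -> List[float]:
    """Simple moving average: the mean of each run of n consecutive prices."""
    return [sum(prices[i - n + 1:i + 1]) / n for i in range(n - 1, len(prices))]

def ema(prices: List[float], n: int) -> List[float]:
    """Exponential moving average, seeded with the first price (a simplification)."""
    if not prices:
        return []
    alpha = 2 / (n + 1)  # standard smoothing factor for an n-period EMA
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def percent_k(closes: List[float], highs: List[float], lows: List[float],
              n: int = 14) -> List[Optional[float]]:
    """%K: where the close sits within the n-period high-low range, as a percentage."""
    out: List[Optional[float]] = []
    for i in range(n - 1, len(closes)):
        hi = max(highs[i - n + 1:i + 1])
        lo = min(lows[i - n + 1:i + 1])
        # A flat window (hi == lo) makes the ratio undefined, so report None.
        out.append(None if hi == lo else 100 * (closes[i] - lo) / (hi - lo))
    return out

def percent_d(ks: List[Optional[float]], n: int = 3) -> List[Optional[float]]:
    """%D: n-period simple moving average of %K; None if the window has a gap."""
    out: List[Optional[float]] = []
    for i in range(n - 1, len(ks)):
        window = ks[i - n + 1:i + 1]
        if any(k is None for k in window):
            out.append(None)
        else:
            out.append(sum(window) / n)
    return out
```

Note the guard in percent_k: when the high and low over the window are equal, the denominator is zero, so the sketch returns None for that point rather than dividing.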

With this completed report, I blog it (sample link at the top of this recipe article), and also tweet the graph and three charts on our company's @logicalgraphs twitter account.

