
Day 23 Big Data on Hadoop Udemy | Arduino

So my wife has secured a job in Luxembourg and I need to move along with her. I looked at how I could maximise my chances. Machine Learning has really good opportunities, but I would probably need at least 6 months. Three things kept coming to mind: Data Science, AI/ML and Big Data.
I found Big Data interesting and more achievable within a 3-month target, as I'm familiar with most of the technologies and mainly need to upgrade my skills.

I had to choose between the Microsoft Program in Big Data and the Udemy Big Data course.
I chose Udemy as its schedule is slightly shorter and its content more precise. The pricing is also very attractive. Microsoft could have been a similar case, but $1000 for a certificate that teaches similar skills, just on the Microsoft stack, seemed a little steep.

So I've bought the Udemy course, Ultimate Hands on Big Data. Another exciting thing for me is that the trainer is a former Amazon and IMDb employee, and I actually quite like their UX, the way one can search, and similar functionality.

To give some background on the Arduino: I have completed the ultrasonic module. What I'm trying to figure out now is controlling the speed of the motor along with the servo motor. Will update more.

Starting with Big Data, the first thing I came across is installing VirtualBox, which I had been fundamentally missing, as we never get to switch to Linux when working with Windows Servers.
So this is a good opportunity to refresh my command line skills.

https://www.virtualbox.org/wiki/VirtualBox

Image for Hadoop
https://hortonworks.com/downloads/#data-platform
https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/

The image is 11 GB and looks like it has most things already installed on it.

Using a hypervisor on Windows 10 (Hyper-V conflicts with VirtualBox):
https://blog.zeddba.com/2017/09/25/disabling-microsofts-hyper-v-to-use-oracles-virtualbox/
https://support.microsoft.com/en-ie/help/3204980/virtualization-applications-do-not-work-together-with-hyper-v-device-g

Online Hadoop Cluster
http://127.0.0.1:8088/cluster
Hadoop
http://127.0.0.1:8888/
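
Before opening those two pages in a browser, it can help to check whether the VM is actually forwarding the ports. Below is a minimal sketch, assuming the image is running and exposes ports 8088 and 8888 on 127.0.0.1 as in the URLs above. The helper name is_port_open and the labels in ENDPOINTS are made up for illustration.

```python
import socket

# Ports taken from the two URLs above; the labels are only descriptive.
ENDPOINTS = {"Hadoop cluster page": 8088, "Hadoop page": 8888}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    for label, port in ENDPOINTS.items():
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"{label} (port {port}): {state}")
```

A "closed" result usually just means the VM has not finished booting or port forwarding is not set up yet.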

DataSet
https://www.kaggle.com/prajitdatta/movielens-100k-dataset
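
A quick way to get a feel for the dataset before loading it into Hadoop is to parse it in plain Python. This is a rough sketch, assuming the ratings file (u.data in the MovieLens 100k download) is tab-separated with the columns user id, item id, rating and timestamp. The sample rows below are invented in that layout, and the function names are my own.

```python
from collections import Counter

# Invented sample rows in the assumed u.data layout:
# user id <TAB> item id <TAB> rating <TAB> timestamp
SAMPLE = "1\t10\t5\t880000000\n2\t10\t3\t880000100\n1\t20\t3\t880000200\n3\t30\t1\t880000300\n"


def parse_ratings(text):
    """Turn tab-separated rating lines into (user, item, rating, timestamp) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        user_id, item_id, rating, timestamp = line.split("\t")
        rows.append((int(user_id), int(item_id), int(rating), int(timestamp)))
    return rows


def rating_histogram(rows):
    """Count how many times each rating value occurs."""
    return Counter(rating for _, _, rating, _ in rows)


if __name__ == "__main__":
    rows = parse_ratings(SAMPLE)
    for rating, count in sorted(rating_histogram(rows).items()):
        print(rating, count)
```

To run it on the real file, read the file's contents into a string and pass that to parse_ratings instead of SAMPLE.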

username and password for Hadoop
maria_dev
maria_dev
