Web crawling and scraping in Python

Processing the webpage. In this article we will learn the following things: basic crawling setup in Python, basic crawling with AsyncIO, a scraper util service, and Python scraping via the Scrapy framework. Web Crawler: A web crawler is an internet bot that systematically browses the World Wide Web for the purpose of extracting useful information. Web Scraping: Extracting useful information from a webpage is termed web scraping. Basic Crawler demo: We have been using the following tools: the Python requests module (https://pypi.org/project/requests/), used to make a crawler bot, and the Parsel library (https://parsel.readthedocs.io/en/latest/usage.html), used as a scraping tool. Task I: Scrape the Recurship website (http://recurship.com), extract all links…

Continue Reading
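
The link-extraction task in the crawling excerpt above can be sketched roughly as follows. This is a minimal sketch, not the article's code: it swaps in the standard library's `html.parser` for the requests and Parsel tools the article names, and it parses a hypothetical inline page (`SAMPLE_HTML`) instead of downloading one. The names `LinkExtractor` and `extract_links` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors that have no href value.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all links found in an HTML string, as absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample page; a real crawler would download this HTML first.
SAMPLE_HTML = """
<html><body>
  <a href="/about">About</a>
  <a href="http://example.com/blog">Blog</a>
  <a name="no-href">Anchor without a link</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML, "http://example.com"):
        print(link)
```

A crawler would then fetch each extracted link in turn and repeat the extraction, keeping a set of visited URLs to avoid loops.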

Why you should use Django for your next project

Python was one of the most used and easiest programming languages to learn in 2018, and Django is the most used Python web framework. As suggested by its creators, Django is a web framework for perfectionists with deadlines. Django is a robust web framework that comes with a lot of built-in functions and ready-made modules to fit most web application use cases. The following is a list of goodies that come packaged with Django. Database ORM: The…

Continue Reading

Educational Products 2018.3: More Kotlin and Python Learning, Better UI and Performance

With the 2018.3 release, Educational Products bring more learning opportunities for anyone interested in Kotlin and Python. The updated version also enhances the performance, user interface, and user experience of our IDEs to help learners focus on learning and teachers on teaching. Sections: Learning Kotlin, Learning Python, Better User Experience. Learning Kotlin: The Kotlin Koans course, available within Educational Products, is one of the most popular ways of getting familiar with the Kotlin syntax. Now we’re happy…

Continue Reading

Auto incrementing IDs for MongoDB

If you’re familiar with relational databases like MySQL or PostgreSQL, you’re probably also familiar with auto incrementing IDs. You select a primary key for a table and make it auto incrementing. Every row you insert afterwards gets a new ID, automatically incremented from the last one. We don’t have to keep track of what number comes next or ensure the atomic nature of this operation (what happens if two different client…

Continue Reading

How I automated the boring University stuff with Python

Hello! My college has a general student login, where students can view their profiles, upload assignments, get due dates, download course materials, and so on. But the site is kind of tedious to navigate, so I decided to use Python to automate the boring stuff. One of those automations is the Assignment Reminder Service. For this article, you need to know a bit about how the web request-response model works and have some basic Python knowledge. It will be split into…

Continue Reading

How I developed a captcha cracker for my University’s website

Hello! Consider this a spinoff of my original article. I had some requests from readers to explain how I developed the captcha cracker, so I decided to share the story of my first (significant?) project with you. Let’s start! When I developed this set of scripts, I had zero knowledge of image processing or the algorithms used in it. I worked on this in my first year. The basic ideas I…

Continue Reading

Concurrent Programming in Python is not what you think it is.

In this article, I will first walk you through the distinction between concurrent programming and parallel execution, then discuss Python’s built-in concurrent programming mechanisms and the pitfalls of multi-threading in Python. Concurrent programming is not equivalent to parallel execution, despite the fact that the two terms are often used interchangeably. (Figure: illustration of concurrency without parallelism.) Concurrency is a property under which more than one operation can be run simultaneously, but that doesn’t mean they will be. (Imagine if…

Continue Reading
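
The distinction in the excerpt above, concurrency without parallelism, can be illustrated with a small sketch. It is not taken from the article; it assumes `asyncio` as the concurrency mechanism, and the names `worker`, `run_demo`, and `demo` are hypothetical. Two tasks make progress in interleaved steps, yet every step runs on the same thread, so nothing executes in parallel.

```python
import asyncio
import threading


async def worker(name, steps, log):
    """Record each step together with the thread it ran on."""
    for step in range(steps):
        log.append((name, step, threading.get_ident()))
        # Yield control to the event loop so the other task can run.
        await asyncio.sleep(0)


async def run_demo():
    log = []
    # Both workers are in progress at once (concurrency),
    # but the event loop runs only one of them at any instant.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


def demo():
    return asyncio.run(run_demo())


if __name__ == "__main__":
    for name, step, thread_id in demo():
        print(name, step, thread_id)
```

The printed steps of A and B alternate, while the thread identifier stays the same on every line.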

A Simple Guide to Becoming a Web Developer

Recently you may have decided to learn Web Development. Where do you start? What do you need to know? What resources are out there to help you get started? Where do you find others like you, and people more experienced than you, who will lend a helping hand? Luckily for you, I’ve been down this road a couple of different times in different ways. When I first started writing software, I wanted to be a…

Continue Reading

Could Python’s Popularity Outperform JavaScript in the Next Five Years?

JavaScript and Python are two influential programming languages for building a wide range of applications. While JavaScript has been the dominant programming language for many years, Python’s fast growth threatens to dethrone the widely popular technology. Melight, who has more than ten years of software development experience and currently teaches people his skills, says that “with the recent developments in the technology space, we are likely to see a neck and neck popularity competition between JavaScript vs.…

Continue Reading

How to get rid of loops and use window functions, in Pandas or Spark SQL

(Image source: https://pixabay.com/en/stained-glass-spiral-circle-pattern-1181864/) Every software developer knows that iterating through rows of a dataset is one sure killer of performance. Loops are bad. Vectorized operations (operations that work on entire arrays) are good. Pandas, the Python library for data analysis (https://pandas.pydata.org/), has vectorized operations for everything, allowing for great performance. For more on this topic, see an excellent article by Sofia Heisler (https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6). So, we can add a new calculated column to a Pandas dataframe, in…

Continue Reading

Reducing Memory Footprint While Creating Archive in Django

Python’s built-in zipfile library is commonly used to create archives. However, there is a concern when creating zip files with the built-in library. Consider the case where we are zipping files larger than our available memory: we would easily run out of memory. I was building a feature that requires zipping files and uploading them to our Django backend storage. After digging around the internet, I have summarized the logic I used to support this feature. Use NamedTemporaryFile Instead Of Memory: A…

Continue Reading
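
The idea named in the excerpt above, backing the archive with a `NamedTemporaryFile` instead of an in-memory buffer, might look roughly like this. It is a sketch under assumptions, not the article's code: the helper name `zip_to_temp_file` is hypothetical, and the Django upload step is left out.

```python
import os
import tempfile
import zipfile


def zip_to_temp_file(paths):
    """Write the given files into a zip archive backed by a temporary file.

    Returns the open temporary file, rewound to the start.
    The file is deleted from disk when it is closed.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=".zip")
    with zipfile.ZipFile(tmp, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # The archive grows on disk, not in a memory buffer.
            archive.write(path, arcname=os.path.basename(path))
    tmp.seek(0)
    return tmp


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical sample file standing in for a large upload.
        sample = os.path.join(workdir, "report.txt")
        with open(sample, "w") as handle:
            handle.write("hello archive\n")
        with zip_to_temp_file([sample]) as tmp:
            with zipfile.ZipFile(tmp) as archive:
                print(archive.namelist())
```

In a Django project, the returned file object could then be wrapped in `django.core.files.File` and handed to the storage backend; whether the article does exactly that is an assumption.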

Wiring Communication Between Microservices

Choosing a means to connect microservices is never an easy task; many factors are taken into account before settling on an option. If you are building a production-ready system, I guess the principle of weighing all factors holds true. Yes, I know this doesn’t apply to visionaries :) In this article, I will run through some common means of communication, briefly describe the background of our project, and give my arguments for choosing RPC over the remaining options. Before…

Continue Reading

Simple Image Steganography in Python

A brief introduction to the art and science of data hiding. (Photo by Markus Spiske on Unsplash… and yes, there’s something hidden.) In this post I’ll demonstrate how to achieve simple image steganography using Python. All digital file formats use internal structures and schemas, therefore unique implementations are required for different mediums, and often for different formats within those mediums. What is Steganography: Steganography is the art and science of concealing a message or file within a different, typically unrelated…

Continue Reading
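
The excerpt above does not say which hiding technique the post uses. One common approach, assumed here purely for illustration, is least-significant-bit (LSB) embedding: each bit of the message overwrites the lowest bit of one carrier byte. In this sketch raw bytes stand in for pixel values, and the names `hide` and `reveal` and the 4-byte length prefix are hypothetical choices, not the article's.

```python
def hide(carrier: bytes, message: bytes) -> bytearray:
    """Hide message in the least significant bit of each carrier byte."""
    # Prefix the message with its length so reveal() knows where to stop.
    payload = len(message).to_bytes(4, "big") + message
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for this message")
    stego = bytearray(carrier)
    for index, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[index] = (stego[index] & 0xFE) | bit
    return stego


def reveal(stego: bytes) -> bytes:
    """Recover a message previously embedded with hide()."""

    def read_bytes(start_bit, count):
        out = bytearray()
        for i in range(count):
            value = 0
            for j in range(8):
                value = (value << 1) | (stego[start_bit + i * 8 + j] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


if __name__ == "__main__":
    # Hypothetical carrier: 512 bytes standing in for image pixel data.
    carrier = bytes(range(256)) * 2
    stego = hide(carrier, b"secret")
    print(reveal(stego))
```

Because only the lowest bit of each byte changes, every carrier value shifts by at most 1, which is why the change is hard to see in an image.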

Writing a Basic Keylogger for macOS in Python

A brief look at how to covertly log user activity on macOS. (Photo by Christin Hume on Unsplash.) ⚠️ This post is for educational purposes only ⚠️ A keylogger is probably one of the last things you want on your computer. Unfortunately, this type of program is usually well hidden and often goes completely undetected by the victim. At its core, a keylogger is a device or program that logs everything you type on your computer, meaning that every password, every private message you…

Continue Reading

Get A Quick Start With PySpark And Spark-Submit

We just released a new open source boilerplate template to help you (any Spark user) run spark-submit commands smoothly, covering things such as inserting dependencies, project source code and more. TLDR: Here is an open source template to help you get started. At Soluto, as part of a data scientist’s day-to-day work, we create ETL (Extract, Transform, Load) jobs. Our main tool for this is Spark, specifically PySpark, with spark-submit. Spark is used for distributed computing on large-scale datasets. spark-submit helps you launch…

Continue Reading

Automatic backup of git repositories to Dropbox with Python

Originally published on: 01.12.2017. Intro: I will show how to upload files to Dropbox from Python code. Why do I need this? Currently, I am only using WebFaction for all my web services and also as my private git server. I wanted to make an automatic backup of my git repositories to Dropbox. Dropbox App: In order to upload files to Dropbox you need to have an access token. And for the access token, you need to register your app on the DBX platform. All of this must be…

Continue Reading

S3 trickery, using it as a scheduler

(“March calendar” by Charles Deluvio 🇵🇭🇨🇦 on Unsplash.) One of the fun parts of using serverless is the fact that you can try out new ideas and provision them with the flick of a finger. I’ve mentioned more than once that S3 is a powerful tool that can be used as more than an elastic persistent layer. S3, the best of 2 worlds: In this post, I’m going to demonstrate how to use S3 as a scheduling mechanism…

Continue Reading

Top 10 Libraries in Python to Implement Machine Learning

Nowadays, Python is one of the most popular and widely used programming languages and has replaced many programming languages in the industry. There are a number of reasons why Python is popular among developers, and one of them is that it has a large collection of libraries. According to builtwith.com, 45% of technology companies prefer to use Python for implementing AI and machine learning. Some of the important reasons why Python is popular: From developing to deploying…

Continue Reading

Fundamental Python Data Science Libraries: A Cheatsheet (Part 4/4)

If you are a developer and want to integrate data manipulation or science into your product, or are starting your journey in data science, here are the Python libraries you need to know: NumPy, Pandas, Matplotlib, and Scikit-Learn. The goal of this series is to provide introductions, highlights, and demonstrations of how to use the must-have libraries so you can pick what to explore in more depth. Scikit-Learn: Scikit-Learn is built on top of NumPy, SciPy, and matplotlib. It contains an extensive collection of ready-to-use…

Continue Reading

Python comes to rescue again — Electives Allocation

This is a Python program for automating electives allocation, based on the time at which students filled in a Google Form. If you want to see the code directly, skip the Background. Background: I was ready to go to college for one last semester. Since the pressure of getting a job was already over, I was looking forward to enjoying the last leg of my college life. My college, NIT Trichy, offers three electives for the last term. After 3.5 years…

Continue Reading