English – Tech & Software Development

C# async/await in depth

01. C# Async/Await/Task Explained (Deep Dive). (Channel “Raw Coding”).
“In this tutorial we cover the asynchronous programming style in C#. We take a deep dive, looking at the state machine that the async keyword spawns, how the await keyword creates checkpoints in the state machine, and how the Task class allows us to bridge our code to use the asynchronous programming model.”

02. How to use Async/Await/Task in C#. (Channel “Raw Coding”)
“In this tutorial we take a look at how to use async, await and Task in C#. Primarily looking at good practices and how to avoid common pitfalls such as creating unnecessary state machines, blocking threads, using ConfigureAwait when making libraries, and how to avoid async in constructors.”

Smart Hasher – hasher with many convenient features implemented on python

Smart Hasher is an Open Source command-line tool with many convenient features.

It is available on GitHub: https://github.com/sergtk/smart_hasher

I have just completed the implementation of many features that I wanted to have and did not find in other tools.

The story of this project

Before starting this project I needed to calculate a hash function for many large files to be confident that they are not corrupted. These files were in the cloud.

So I tried some native Windows tools, found some third-party ones, and started to use them. Many of them support only the MD5 hash, which is now considered obsolete.
I would prefer SHA-1. Smart Hasher also supports some other popular hash algorithms; the full list is: md5, sha1, sha224, sha256, sha384, sha512. All of them are supported by Python out of the box through the hashlib library.
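As a minimal sketch of how such hashes can be computed with hashlib, the function below reads a file in chunks so that a large file never has to fit in memory. The names `file_hash`, `SUPPORTED` and `chunk_size` are chosen for illustration and are not taken from the Smart Hasher source.

```python
import hashlib

# Algorithms listed in the post; all of them are available in hashlib.
SUPPORTED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def file_hash(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Calling `file_hash("archive.zip", "sha256")` would return the SHA-256 digest of a (hypothetical) file `archive.zip` as a hexadecimal string.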

The following happened when I tried to calculate hashes for files in the cloud. Some of the 4 GB files were processed in about 10-20 minutes, while others took much longer. I could not figure out what was going on. Over a whole night the hash was calculated for only several files. Something similar happened over the next several days.
Sometimes the tools calculated hashes quickly, but not always. The speed was very unstable.
Moreover, other sites, like YouTube, were working fast. When I took my notebook to another network, the hash calculation worked fine. But again, it was difficult to understand the situation.

So I decided to write a tool to calculate the hashes. And I wanted to see progress easily and conveniently.
I didn’t find any hash calculation tool with such features. I could have considered using other tools to diagnose the network speed, but it was not clear which would be faster: using other tools, or writing my own, which I could change as I wanted without being restricted by an existing feature set.

I implemented the tool with a feature to show the speed for the whole current file and for the last several seconds.
With this data, I started to email the cloud support and to call my ISP.
Eventually, I reached a guy from my ISP. It was quite easy to get past the call center “guards” with speed numbers. That sysadmin told me that there is a strange bug in their software that occurs with a large amount of data. So that guy just needed to close my session, and everything worked fast again on my side.
To close the session without contacting support, he suggested shutting down the Wi-Fi router for half an hour.
Strange story.
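The speed display mentioned above (speed for the whole file and for the last several seconds) could be sketched roughly as follows. This is only an illustration under my own assumptions: the class name `SpeedMeter`, the 5-second default window and the injectable `clock` parameter are hypothetical and not taken from Smart Hasher.

```python
import time
from collections import deque


class SpeedMeter:
    """Track bytes per second overall and over a sliding time window."""

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self._clock = clock
        self._window = window_seconds
        self._start = clock()
        self._total = 0
        self._samples = deque()  # (timestamp, byte_count) pairs

    def add(self, nbytes):
        """Record that `nbytes` were just read."""
        now = self._clock()
        self._total += nbytes
        self._samples.append((now, nbytes))
        self._trim(now)

    def _trim(self, now):
        # Drop samples that are older than the window.
        while self._samples and now - self._samples[0][0] > self._window:
            self._samples.popleft()

    def overall(self):
        """Average speed since the meter was created."""
        elapsed = self._clock() - self._start
        return self._total / elapsed if elapsed > 0 else 0.0

    def recent(self):
        """Average speed over the last `window_seconds` (or less at startup)."""
        now = self._clock()
        self._trim(now)
        span = min(self._window, now - self._start)
        recent_bytes = sum(n for _, n in self._samples)
        return recent_bytes / span if span > 0 else 0.0
```

In a read loop, one would call `add(len(chunk))` after every chunk and print `overall()` and `recent()` periodically. A large gap between the two numbers is the kind of instability described in the story.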

Another convenient feature for a large amount of data is resuming the calculation after an interruption.
This is implemented simply by skipping the files for which the hash has already been calculated.
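A minimal sketch of this skipping logic, assuming that each input file gets a hash file with the same name plus a suffix. The function name `pending_files` and the `.sha1` suffix are assumptions made for illustration; the actual naming used by Smart Hasher may differ.

```python
import os


def pending_files(input_paths, hash_suffix=".sha1"):
    """Return the inputs that do not yet have a non-empty hash file."""
    pending = []
    for path in input_paths:
        hash_path = path + hash_suffix  # hypothetical naming scheme
        done = os.path.isfile(hash_path) and os.path.getsize(hash_path) > 0
        if not done:
            pending.append(path)
    return pending
```

Treating an empty hash file as "not done" is a design choice of this sketch: it guards against a run that was interrupted after the hash file was created but before the hash was written.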

Another issue with existing tools is that the network connection is sometimes interrupted for a short period of time. So it is good to retry reading data from the file after a short pause. Smart Hasher supports this.
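The retry idea can be illustrated with a small helper. This is a sketch under assumptions, not the Smart Hasher implementation: the name `read_with_retry`, the default retry count and the pause length are all hypothetical.

```python
import time


def read_with_retry(f, size, retries=3, pause_seconds=1.0, sleep=time.sleep):
    """Read up to `size` bytes from `f`, retrying after a pause on OSError."""
    position = f.tell()  # remember where this read started
    for attempt in range(retries + 1):
        try:
            return f.read(size)
        except OSError:
            if attempt == retries:
                raise  # give up after the last allowed retry
            sleep(pause_seconds)
            # Go back to the start of the failed read before trying again.
            f.seek(position)
```

Seeking back to the remembered position matters for hashing: if a failed read had advanced the file position, the retried read would otherwise skip data and produce a wrong hash.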

By default, one hash file is created for every input data file, as many other tools do. But this is not always convenient.
For example, if you have a lot of user files, it is not convenient to bloat the directory with a lot of hash files. So I implemented a feature to store all hashes in a single file. After some time, say a year, the hashes can be recalculated. The new and old hash files may then be compared to find differences and to get an idea of how the user files have changed.
To simplify finding differences, file names are sorted in the hash file.
To find renames, there is an option to sort the hash file by hash values, because the hash of a file does not change if the file is just renamed or moved.
These features are very convenient for checking the integrity of valuable data, e.g. photo archives.
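The comparison of an old and a new hash file can also be done programmatically. The sketch below assumes that both files have already been parsed into dictionaries mapping file names to hashes; the function name `compare_snapshots` and the shape of its result are hypothetical, and the parsing of the actual hash file format is left out.

```python
def compare_snapshots(old, new):
    """Compare two {file_name: hash} mappings taken at different times."""
    # Index old names by hash so that renamed or moved files can be matched.
    old_by_hash = {}
    for name, digest in old.items():
        old_by_hash.setdefault(digest, []).append(name)

    # Same name in both snapshots, but different content.
    changed = sorted(n for n in new if n in old and old[n] != new[n])

    removed_names = set(old) - set(new)
    renamed, added = [], []
    for name in sorted(set(new) - set(old)):
        # A new name whose hash matches a vanished old name is treated
        # as a rename or move, since the content is unchanged.
        candidates = [o for o in old_by_hash.get(new[name], []) if o in removed_names]
        if candidates:
            source = sorted(candidates)[0]
            removed_names.discard(source)
            renamed.append((source, name))
        else:
            added.append(name)

    return {
        "changed": changed,
        "renamed": renamed,
        "added": added,
        "removed": sorted(removed_names),
    }
```

For a photo archive, a non-empty `changed` list is the interesting one: a photo whose name is the same but whose hash differs has been modified or corrupted.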

To make it easy to parse the file programmatically, I also implemented saving data in JSON with the Python json library. But actually, this was more for practice with Python; I haven't used this feature yet.

There are other features that I didn’t describe here. You may find a description of them in the file USAGE.md.
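As an illustration of what saving hashes with the json library might look like, here is a small sketch. The JSON layout (the `algorithm`, `files`, `file` and `hash` keys) and the function name `hashes_to_json` are assumptions made for this example; the real output format of Smart Hasher is described in its own documentation.

```python
import json


def hashes_to_json(hashes, algorithm="sha1"):
    """Serialize a {file_name: hash} mapping to a JSON string, sorted by name."""
    records = [
        {"file": name, "hash": digest}
        for name, digest in sorted(hashes.items())
    ]
    # The layout below is hypothetical, chosen only for this example.
    return json.dumps({"algorithm": algorithm, "files": records}, indent=2)
```

The result can be read back with `json.loads`, which is what makes this format easier to process than a plain text hash file.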


Painless update of third party libraries

Introduction

Sometimes you need to recompile several or all third-party libraries in the project you develop. This may happen if you need to switch to another compiler or upgrade the version of an existing one.

If you have a lot of libraries in the project, you will be fixing error after error, because third-party interfaces change. But you cannot test anything until you have finished fixing all the errors.

So it would be good to update piece by piece.

Approaches to updating in small pieces are described below. You will probably not find anything tricky, but it is good to have small tricks collected in one place. This may be especially useful when you are stuck and bored. This article may help you quickly refresh the approaches that can be considered in order to move on.

Technical debt. Circle of Death

Technical debt. Circle of Death:

Source: Josh Susser on Twitter

Key words: Quora

Using Git with SVN repository – simple case howto and benefits

Introduction

If you use an Apache Subversion repository, you may want to use some of the benefits of the Git source control management system. But you may not be able to switch from the SVN repository to Git completely.

One of the big benefits of Git over Subversion is the possibility to commit when you don’t have an internet connection. Another one is the availability of the commit history when there is no internet connection.

You may achieve these benefits with the Git-SVN functionality described in the Git Book chapter “Git and Other Systems – Git and Subversion”. There is also a git svn man page available.

Below you will find how to set up and use some of the Git benefits while you continue to work with a centralized SVN repository.

It is expected that you have some basic knowledge of Subversion and Git before setting up communication between them.