Documentation

Basic usage


import os
from pySmartDL import SmartDL

url = "http://mirror.ufs.ac.za/7zip/9.20/7za920.zip"
dest = "C:\\Downloads\\" # or '~/Downloads/' on linux

obj = SmartDL(url, dest)
obj.start()
# [*] 0.23 Mb / 0.37 Mb @ 88.00Kb/s [##########--------] [60%, 2s left]

path = obj.get_dest()

For more examples please refer to the Code Examples page.

pySmartDL.SmartDL (main class)

class pySmartDL.SmartDL(urls, dest=None, progress_bar=True, fix_urls=True, threads=5, logger=None, connect_default_logger=False)

The main SmartDL class

Parameters:
  • urls (string or list of strings) – Download url. It is possible to pass unsafe and unicode characters. You can also pass a list of urls, and those will be used as mirrors.
  • dest (string) – Destination path. Default is %TEMP%/pySmartDL/.
  • progress_bar (bool) – If True, prints a progress bar to the stdout stream. Default is True.
  • fix_urls (bool) – If true, attempts to fix urls with unsafe characters.
  • threads (int) – Number of threads to use.
  • logger (logging.Logger instance) – An optional logger.
  • connect_default_logger (bool) – If true, connects a default logger to the class.
Return type:

SmartDL instance

Note

The provided dest may be a folder or a full path name (including filename). The workflow is:

  • If the path exists and is a folder, the file is downloaded there under its original filename.
  • If the path does not exist, the needed folders are created and the last section of the path is used as the filename.
  • If you want to download to a folder that does not exist yet, and want the module to fill in the filename, make sure the path ends with os.sep.
  • If no path is provided, %TEMP%/pySmartDL/ will be used.
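
The rules above can be sketched in plain Python. This is only an illustration, not the library's code: resolve_dest and its url_filename argument are hypothetical names, the expansion of "~" is an assumption, and unlike the library the sketch never creates folders.

```python
import os
import tempfile


def resolve_dest(dest, url_filename):
    """Hypothetical helper mirroring the documented destination rules."""
    if not dest:
        # No destination given: fall back to <temp dir>/pySmartDL/.
        return os.path.join(tempfile.gettempdir(), "pySmartDL", url_filename)

    dest = os.path.expanduser(dest)  # assumption: allow '~/Downloads/' paths
    if os.path.isdir(dest) or dest.endswith(os.sep):
        # Existing folder, or a path explicitly marked as a folder.
        return os.path.join(dest, url_filename)

    # Anything else: the last section of the path is taken as the filename.
    return dest
```

The trailing os.sep is what distinguishes "a folder that does not exist yet" from "a file name without an extension", which is why the third rule asks for it.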
add_basic_authentication(username, password)

Uses HTTP Basic Access authentication for the connection.

Parameters:
  • username (string) – Username.
  • password (string) – Password.
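
For background, HTTP Basic authentication sends the credentials as a base64-encoded "username:password" pair in the Authorization header. The sketch below shows that encoding only; basic_auth_header is a hypothetical name, and how pySmartDL builds the header internally is not documented here.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

Because base64 is an encoding and not encryption, the credentials are only protected when the connection itself is encrypted.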
add_hash_verification(algorithm, hash)

Adds hash verification to the download.

If the hash is not correct, the other mirrors will be tried. If no mirror passes hash verification, HashFailedException will be raised.

Note

If the downloaded file already exists at the destination and its hash matches, pySmartDL will not download it again.

Warning

The hashing algorithm must be supported on your system, as documented at hashlib documentation page.

Parameters:
  • algorithm (string) – Hashing algorithm.
  • hash (string) – Hash code.
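
To produce a value for the hash parameter, or to check a file by hand, the digest can be computed with hashlib. A minimal sketch, assuming the file is available locally; file_hash and chunk_size are names made up for this example.

```python
import hashlib


def file_hash(path, algorithm, chunk_size=64 * 1024):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    hasher = hashlib.new(algorithm)  # raises ValueError for unsupported names
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

A digest obtained this way from a trusted copy could then be passed as obj.add_hash_verification("sha256", digest) before calling obj.start().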
fetch_hash_sums()

Will attempt to fetch UNIX hash sums files (SHA256SUMS, SHA1SUMS or MD5SUMS files in the same url directory).

Calls self.add_hash_verification if successful. Returns whether a matching hash was found.

Return type:bool

New in 1.2.1
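
Hash sums files written by tools such as sha256sum list one "digest  filename" pair per line. The sketch below shows how a matching entry could be looked up in such a file; find_hash is a hypothetical helper and not necessarily how fetch_hash_sums() parses the file.

```python
def find_hash(sums_text, filename):
    """Return the hex digest listed for `filename`, or None if absent."""
    for line in sums_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # A leading '*' marks binary mode in coreutils output.
        if name.lstrip("*") == filename:
            return digest.lower()
    return None
```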

start(blocking=None)

Starts the download task. Will raise RuntimeError if the object is already downloading.

Warning

If you’re using the non-blocking mode, exceptions won’t be raised. In that case, call isSuccessful() after the task is finished to make sure the download succeeded, and call get_errors() to get the exceptions.

Parameters:blocking (bool) – If true, calling this function will block the thread until the download finishes. Default is True.
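
A non-blocking download is typically driven by a polling loop. The sketch below uses only methods documented on this page; download_nonblocking and poll_interval are hypothetical names, and obj is assumed to be a SmartDL instance that has not been started yet.

```python
import time


def download_nonblocking(obj, poll_interval=0.2):
    """Run a SmartDL-like object without blocking and report progress.

    Returns a (destination_path, errors) tuple.
    """
    obj.start(blocking=False)
    while not obj.isFinished():
        print("%s | %s left | %3d%% | %s" % (
            obj.get_speed(human=True),
            obj.get_eta(human=True),
            obj.get_progress() * 100,
            obj.get_status(),
        ))
        time.sleep(poll_interval)

    # Exceptions are not raised in non-blocking mode, so check explicitly.
    if obj.isSuccessful():
        return obj.get_dest(), []
    return None, obj.get_errors()
```

Returning the error list instead of raising keeps the caller in control of how failures are reported.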
get_eta(human=False)

Gets the estimated time until download completion, in seconds. Returns 0 if there is not enough data to calculate the estimate (this happens during roughly the first 5 seconds of each download).

Parameters:human (bool) – If true, returns a human-readable formatted string. Else, returns an int type number
Return type:int/string
get_speed(human=False)

Get current transfer speed in bytes per second.

Parameters:human (bool) – If true, returns a human-readable formatted string. Else, returns an int type number
Return type:int/string
get_progress()

Returns the current progress of the download, as a float between 0 and 1.

Return type:float
get_progress_bar(length=20)

Returns the current progress of the download as a string containing a progress bar.

Note

That’s an alias for pySmartDL.utils.progress_bar(obj.get_progress()).

Parameters:length (int) – The length of the progress bar in chars. Default is 20.
Return type:string
isFinished()

Returns whether the task is finished.

Return type:bool
isSuccessful()

Returns whether the download was successful. It may fail in the following scenarios:

  • Hash check is enabled and fails.
  • All mirrors are down.
  • Any local I/O problems (such as no disk space available).

Note

Call get_errors() to get the exceptions, if any.

Will raise RuntimeError if it’s called when the download task is not finished yet.

Return type:bool
get_errors()

Gets the errors that occurred while downloading.

Return type:list of Exception instances
get_status()

Returns the current status of the task. Possible values: ready, downloading, paused, combining, finished.

Return type:string
wait(raise_exceptions=False)

Blocks until the download is finished.

Parameters:raise_exceptions (bool) – If true, this function will raise exceptions. Default is False.
stop()

Stops the download.

pause()

Pauses the download.

unpause()

Continues the download.

get_dest()

Gets the destination path of the downloaded file. Needed when no destination was provided to the class and the file was saved in a temp folder.

Return type:string
get_dl_time(human=False)

Returns how long the download took, in seconds. Returns -1 if the download task is not finished yet.

Parameters:human (bool) – If true, returns a human-readable formatted string. Else, returns an int type number
Return type:int/string
get_dl_size(human=False)

Gets the number of downloaded bytes.

Parameters:human (bool) – If true, returns a human-readable formatted string. Else, returns an int type number
Return type:int/string
get_data(binary=False, bytes=-1)

Returns the downloaded data. Will raise RuntimeError if it’s called when the download task is not finished yet.

Parameters:
  • binary (bool) – If true, will read the data as binary. Else, will read it as text.
  • bytes (int) – Number of bytes to read. Negative values will read until EOF. Default is -1.
Return type:

string

get_data_hash(algorithm)

Returns the downloaded data’s hash. Will raise RuntimeError if it’s called when the download task is not finished yet.

Parameters:algorithm (string) – Hashing algorithm.
Return type:string

Warning

The hashing algorithm must be supported on your system, as documented at hashlib documentation page.

Exceptions

The following exceptions may be raised:

exception pySmartDL.HashFailedException

May be raised when hash check fails.

exception pySmartDL.CanceledException

Raised when user cancels the task with SmartDL.stop().

exception urllib2.HTTPError

May be raised due to problems with the servers. Read more on the official documentation.

exception urllib2.URLError

May be raised due to problems while reaching the servers. Read more on the official documentation.

exception exceptions.IOError

May be raised due to any local I/O problems (such as no disk space available). Read more on the official documentation.

Warning

If you’re using the non-blocking mode, exceptions won’t be raised. In that case, call isSuccessful() after the task is finished to make sure the download succeeded, and call get_errors() to get the exceptions.
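
When inspecting the list returned by get_errors(), the order of the type checks matters. On Python 3 the urllib2 exceptions live in urllib.error, where HTTPError subclasses URLError and URLError subclasses OSError (IOError). The sketch below is an assumption-laden illustration: describe_error is a hypothetical helper, and the pySmartDL exceptions are matched by class name only so that the snippet runs without the package installed.

```python
import urllib.error


def describe_error(exc):
    """Map an exception from get_errors() to a short category string."""
    # Illustrative shortcut: match pySmartDL's exceptions by class name.
    name = type(exc).__name__
    if name == "HashFailedException":
        return "hash"
    if name == "CanceledException":
        return "canceled"
    # Test the most specific class first because of the inheritance chain.
    if isinstance(exc, urllib.error.HTTPError):
        return "http"
    if isinstance(exc, urllib.error.URLError):
        return "network"
    if isinstance(exc, OSError):
        return "io"
    return "unknown"
```

In real code, isinstance checks against pySmartDL.HashFailedException and pySmartDL.CanceledException would be the more robust choice.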

pySmartDL.utils (helper class)

The Utils class contains many functions for project-wide use.

pySmartDL.utils.combine_files(parts, dest)

Combines files.

Parameters:
  • parts (list of strings) – Source files.
  • dest (string) – Destination file.
pySmartDL.utils.url_fix(s, charset='utf-8')

Sometimes you get a URL from a user that isn’t a real URL because it contains unsafe characters like ‘ ‘ and so on. This function can fix some of the problems in a similar way to how browsers handle data entered by the user:

>>> url_fix(u'http://de.wikipedia.org/wiki/Elf (Begriffsklärung)')
'http://de.wikipedia.org/wiki/Elf%20%28Begriffskl%C3%A4rung%29'
Parameters:
  • s (string) – Url address.
  • charset (string) – The target charset for the URL if the url was given as unicode string. Default is ‘utf-8’.
Return type:

string

(taken from werkzeug.utils)
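
A rough Python 3 equivalent can be written with urllib.parse. This is a sketch modelled on the behaviour shown in the example above, not the library's source: url_fix_sketch is a hypothetical name, the charset is fixed to UTF-8, and the host part is left untouched.

```python
from urllib.parse import quote, quote_plus, urlsplit, urlunsplit


def url_fix_sketch(url):
    """Percent-encode unsafe characters in the path and query of a URL."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    path = quote(path, safe="/%")          # keep separators and existing escapes
    query = quote_plus(query, safe=":&=")  # keep query-string delimiters
    return urlunsplit((scheme, netloc, path, query, fragment))
```

Keeping "%" in the safe set means an already-escaped URL passes through unchanged, so the function can be applied more than once without double-encoding.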

pySmartDL.utils.progress_bar(progress, length=20)

Returns a textual progress bar.

>>> progress_bar(0.6)
'[##########--------]'
Parameters:
  • progress (float) – A number between 0 and 1 that describes the progress.
  • length (int) – The length of the progress bar in chars. Default is 20.
Return type:

string
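
The documented output suggests that length counts the two brackets and that the filled part is rounded down. A minimal sketch under those assumptions (progress_bar_sketch is a hypothetical name, and clamping out-of-range values is also an assumption):

```python
def progress_bar_sketch(progress, length=20):
    """Render progress (0..1) as '[###---]' with `length` total characters."""
    inner = length - 2                       # the two brackets count toward length
    progress = min(max(progress, 0.0), 1.0)  # clamp out-of-range input
    filled = int(progress * inner)           # truncate, matching the 0.6 example
    return "[" + "#" * filled + "-" * (inner - filled) + "]"
```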

pySmartDL.utils.is_HTTPRange_supported(url, timeout=15)

Checks if a server allows Byte serving, using the Range HTTP request header and the Accept-Ranges and Content-Range HTTP response headers.

Parameters:
  • url (string) – Url address.
  • timeout (int) – Timeout in seconds. Default is 15.
Return type:

bool
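
The check described above can be sketched with urllib.request. Both function names are hypothetical, and the exact request pySmartDL sends is not documented here, so the "bytes=0-3" range is an assumption chosen for illustration.

```python
import urllib.request


def headers_indicate_range_support(headers):
    """Decide from response headers whether byte ranges are supported."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if "content-range" in lowered:
        return True  # the server answered the Range request with a range
    return lowered.get("accept-ranges", "").strip().lower() == "bytes"


def probe_range_support(url, timeout=15):
    """Request the first bytes of `url` and inspect the response headers."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-3"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return headers_indicate_range_support(dict(response.headers.items()))
```

Splitting the header logic from the network call keeps the decision rule easy to test without a server.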

pySmartDL.utils.get_filesize(url, timeout=15)

Fetches the size of a file over HTTP.

Parameters:
  • url (string) – Url address.
  • timeout (int) – Timeout in seconds. Default is 15.
Returns:

Size in bytes.

Return type:

int

pySmartDL.utils.get_random_useragent()

Returns a random popular user-agent. Taken from here, last updated on 04/01/2017.

Returns:user-agent
Return type:string
pySmartDL.utils.sizeof_human(num)

Human-readable formatting for filesizes. Taken from here.

>>> sizeof_human(175799789)
'167.7 MB'
Parameters:num (int) – Size in bytes.
Return type:string
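
The example output is consistent with 1024-based units. A minimal sketch that reproduces it; sizeof_human_sketch is a hypothetical name, and using one decimal place for every unit is an assumption, since only the MB case is shown above.

```python
def sizeof_human_sketch(num):
    """Format a byte count using 1024-based units and one decimal place."""
    size = float(num)
    for unit in ("B", "kB", "MB", "GB", "TB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "TB":
            return "%.1f %s" % (size, unit)
        size /= 1024
```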
pySmartDL.utils.time_human(duration, fmt_short=False)

Human-readable formatting for timing. Based on code from here.

>>> time_human(175799789)
'6 years, 2 weeks, 4 days, 17 hours, 16 minutes, 29 seconds'
>>> time_human(589, fmt_short=True)
'9m49s'
Parameters:
  • duration (int) – Duration in seconds.
  • fmt_short (bool) – Format as a short string (47s instead of 47 seconds)
Return type:

string

pySmartDL.utils.create_debugging_logger()

Creates a debugging logger that prints to console.

Return type:logging.Logger instance
class pySmartDL.utils.DummyLogger

A dummy logger. You can call debug(), warning(), etc on this object, and nothing will happen.

class pySmartDL.utils.ManagedThreadPoolExecutor(max_workers)

Managed Thread Pool Executor. A subclass of ThreadPoolExecutor.