easy_german package
Submodules
easy_german.drive module
Small wrapper for the Python Google Drive API client library.
https://developers.google.com/api-client-library/python/apis/drive/v2
easy_german.drive.create_folder(service, name)
Create a folder in Google Drive.
Parameters: - service – Google Drive API instance.
- name (str) – Name of folder to create.
Returns: Google Drive API response.
Return type: dict
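As a rough illustration of what such a wrapper might do with the Drive v2 client, the sketch below builds the folder metadata and calls files().insert(). The name create_folder_sketch is hypothetical, and this is not necessarily how easy_german.drive.create_folder is implemented; service is assumed to be a Drive v2 service object from the Google API client library.

```python
# MIME type that Google Drive uses to mark a file resource as a folder.
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def create_folder_sketch(service, name):
    """Hypothetical sketch: create a folder through a Drive v2 service object."""
    # In Drive v2 a folder is a file resource with the folder MIME type;
    # the display name is stored under the "title" key.
    body = {"title": name, "mimeType": FOLDER_MIME_TYPE}
    # files().insert() builds the request; execute() sends it and returns
    # the API response as a dict.
    return service.files().insert(body=body).execute()
```

The returned dict normally carries the new folder's id, which can then be passed in the parents list of an upload.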
easy_german.drive.delete_file(service, file_id)
Delete a file in Google Drive.
Parameters: - service – Google Drive API instance.
- file_id (str) – ID of the file to delete.
Returns: Google Drive API response.
Return type: dict
easy_german.drive.get_credentials()
Get valid user credentials from storage.
If nothing has been stored, or if the stored credentials are invalid, the OAuth2 flow is run to obtain new credentials.
Returns: The obtained credentials.
Return type: An instance of a Google Drive API class.
easy_german.drive.upload_media(service, path, mime_type, parents=None, resumable=True)
Upload a file to Google Drive.
Parameters: - service – Google Drive API instance.
- path (str) – Path of file to upload.
- mime_type (str) – MIME type of file to upload.
- parents (list) – IDs of the folders to upload the file to; by default the file is uploaded to the root folder.
- resumable (bool) – Whether the upload can be resumed if it is interrupted.
Returns: Google Drive API response.
Return type: dict
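The mime_type argument has to be supplied by the caller. One possible way to derive it, shown here only as a sketch, is the standard library's mimetypes module; guess_mime_type and its default value are hypothetical helpers, not part of easy_german.

```python
import mimetypes


def guess_mime_type(path, default="application/octet-stream"):
    """Hypothetical helper: guess a MIME type from a file name."""
    # guess_type() returns a (type, encoding) pair; type is None when the
    # file extension is not recognised.
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type or default
```

A call could then look like upload_media(service, path, guess_mime_type(path)), leaving parents and resumable at their defaults.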
easy_german.transcripts module
Scrape video transcripts and upload to Google Drive.
easy_german.transcripts.main()
Scrape transcripts and upload to Google Drive.
easy_german.utils module
Utility functions.
easy_german.utils.clear_local_media(tmp_dir)
Remove directory tree.
Parameters: tmp_dir (str) – Path of tree root directory to remove.
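Removing a directory tree is typically a one-line call to shutil.rmtree. The sketch below assumes that approach; clear_local_media_sketch is a hypothetical name, and the choice to ignore a missing directory is an assumption, not documented behaviour of clear_local_media.

```python
import os
import shutil
import tempfile


def clear_local_media_sketch(tmp_dir):
    """Hypothetical sketch: remove tmp_dir and everything below it."""
    # ignore_errors=True makes the call a no-op if the directory is already gone.
    shutil.rmtree(tmp_dir, ignore_errors=True)


# Usage: create a scratch directory with one file, then remove the whole tree.
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "episode.mp3"), "wb") as handle:
    handle.write(b"placeholder")
clear_local_media_sketch(scratch)
```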
easy_german.utils.extract_episode(text, seg_search, eg_search)
Extract episode number from metadata.
Parameters: - text (str) – Metadata containing episode number.
- seg_search (str) – Regex for a Super Easy German episode.
- eg_search (str) – Regex for an Easy German episode.
Returns: Episode number and type.
Return type: dict
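A minimal sketch of how two regexes could be combined to produce such a dict follows. The dict keys, the type labels, and the example patterns are all assumptions made for illustration; the real extract_episode may use different ones. The Super Easy German pattern is tried first because a Super Easy German title would usually also match an Easy German pattern.

```python
import re


def extract_episode_sketch(text, seg_search, eg_search):
    """Hypothetical sketch: return the episode number and series type."""
    # Try the more specific pattern first, then the general one.
    candidates = (("super_easy_german", seg_search), ("easy_german", eg_search))
    for kind, pattern in candidates:
        match = re.search(pattern, text)
        if match:
            # Each pattern is assumed to capture the number in group 1.
            return {"type": kind, "number": int(match.group(1))}
    return {"type": None, "number": None}


# Hypothetical patterns; the regexes used by the package are not shown here.
SEG_PATTERN = r"Super Easy German (\d+)"
EG_PATTERN = r"Easy German (\d+)"

episode = extract_episode_sketch("At the bakery | Super Easy German 12", SEG_PATTERN, EG_PATTERN)
```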
easy_german.utils.get_gdrive_service()
Attempt to authenticate and get Google Drive API service.
Returns: Google Drive API service.
Return type: An instance of a Google Drive API class.
easy_german.videos module
Scrape video audio and upload to Google Drive.
easy_german.videos.main(max_downloads, max_results_per_page)
Scrape audio and upload to Google Drive.
Parameters: - max_downloads (int) – Maximum number of audio tracks to scrape and download.
- max_results_per_page (int) – Maximum number of results in each batch (page); must be less than or equal to 50, the YouTube API pagination limit.
easy_german.videos.process_items(gdrive_service, items, videos_downloaded, max_downloads)
Process YouTube playlist items.
Parameters: - gdrive_service – Google Drive API service.
- items (list) – YouTube playlist items to process.
- videos_downloaded (int) – Total number of audio tracks downloaded so far.
- max_downloads (int) – Maximum number of audio tracks to download.
Returns: Number of audio tracks downloaded in the current batch.
Return type: int
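The interplay between videos_downloaded, max_downloads, and the per-batch return value can be sketched as below. This is an illustration under assumptions, not the package's code: process_items_sketch and handle_item are hypothetical, the Drive service argument is left out, and plain lists stand in for pages of YouTube playlist items.

```python
def process_items_sketch(items, videos_downloaded, max_downloads, handle_item):
    """Hypothetical sketch: process one batch without exceeding max_downloads."""
    downloaded_in_batch = 0
    for item in items:
        # Stop once the running total (earlier batches plus this one) hits the cap.
        if videos_downloaded + downloaded_in_batch >= max_downloads:
            break
        # handle_item is assumed to return True when the item was downloaded.
        if handle_item(item):
            downloaded_in_batch += 1
    return downloaded_in_batch


handled = []


def record_item(item):
    """Stand-in for the real download and upload step."""
    handled.append(item)
    return True


# Usage with two in-memory "pages" of three items each and a cap of four.
pages = [["a", "b", "c"], ["d", "e", "f"]]
total = 0
for page in pages:
    total += process_items_sketch(page, total, 4, record_item)
```

The caller adds each batch's return value to its running total, so the cap holds across pages as well as within one.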
Module contents
Download audio of Easy German videos and upload to Google Drive.
easy_german.get_transcripts()
Scrape video transcripts and upload to Google Drive.
easy_german.get_videos(max_downloads=1, max_results_per_page=10)
Scrape video audio and upload to Google Drive.
Parameters: - max_downloads (int) – Maximum number of audio tracks to scrape and download.
- max_results_per_page (int) – Maximum number of results in each batch (page); must be less than or equal to 50, the YouTube API pagination limit.