Categories
powershell youtube youtube-dl

PowerShell: Download Movies from YouTube with Invoke-Webrequest and youtube-dl

Disclaimer: I have no knowledge of the copyright status (public domain or otherwise) of individual movies uploaded to YouTube. Use good judgement.

Did you know that people apparently upload entire movies to YouTube? In fact, there’s a subreddit dedicated to finding these movies. I’d rather have the movie files themselves than rely on a browser or app to watch them, so I wrote some quick PowerShell code to get these files.

Before we break this down: if you’ve come here thinking about downloading an entire YouTube channel or an entire playlist, you’d be better served by reading up on youtube-dl’s native functionality. It is a very powerful app, and you can probably do what you want with just the correct parameters, without involving PowerShell at all.

There are a couple of pieces here. First, we’ve got a list of these movies at /r/fullmoviesonyoutube/, and we need to scrape the YouTube links. Second, once we have those links, we need to download the movie files, which is where youtube-dl comes in. youtube-dl is a really powerful command line executable that downloads YouTube video files. Get the links, pipe them to youtube-dl, boom, lots of movies.

If you’re new to web scraping, PowerShell’s Invoke-WebRequest is a great place to start. Below, we’re using it to extract all the links from the starting page (the subreddit home), then checking to see if there is a “next” link on the page. If there is, we need to navigate to the next page and extract those links as well. We continue that process until there is no “next” link, meaning the end has been reached.

$youtubelinks = @()
#setting $nextbutton to a non-null value, which is what our loop is going to check for pagination.
$nextbutton = $true
#setting where we're going to start looking for youtube links.
$url = "https://www.reddit.com/r/fullmoviesonyoutube/"

#At the end of each pass through the loop, we check whether there is a link with the text "next ›".
#If there is such a link, we Invoke-WebRequest the href of that link and do it all again.
#When there are no more such links, $nextbutton will be $null, and the loop will end.
while ($null -ne $nextbutton)
    {
    $alllinks = (Invoke-WebRequest $url).links
    #-like simply returns $false when a link has no class or innertext.
    $youtubelinks += $alllinks | where-object {$_.class -like "*title may-blank outbound*"}
    $nextbutton = $alllinks | where-object {$_.innertext -like "*next ›*"}
    $url = $nextbutton.href
    }
#We're piping the results of our scraping to a text file.
$youtubelinks.href | Out-File youtubemovies.txt

The filters use -like instead of calling .contains(), because .contains() throws an error on any link whose class or innertext is null. This code doesn’t check if the videos exist, or confirm anything about them. It just sends any link with a CSS class that contains “title may-blank outbound” (these are specifically the reddit item links) to the $youtubelinks array. youtube-dl can manage everything else. Writing the links to a file lets us test the second part on its own. If you’d just like to jump into downloading, the text file I scraped can be downloaded here.
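
For comparison, here is a rough sketch of the same "follow the next link until it runs out" idea in Python, using only the standard library. LinkCollector and scrape_all_pages are names made up for this example, the class name and link text are simply copied from the PowerShell above (they depend on reddit's markup at the time), and fetch is an assumed callable that returns a page's HTML for a given URL.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href, class and text of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the <a> tag currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            self._current = {
                "href": attrs.get("href") or "",
                "class": attrs.get("class") or "",
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def scrape_all_pages(start_url, fetch):
    """Follow "next" links from start_url and return every item link found."""
    found = []
    seen = set()  # guards against a "next" link that loops back
    url = start_url
    while url and url not in seen:
        seen.add(url)
        parser = LinkCollector()
        parser.feed(fetch(url))
        parser.close()
        found += [link["href"] for link in parser.links
                  if "title may-blank outbound" in link["class"]]
        next_links = [link["href"] for link in parser.links
                      if "next ›" in link["text"]]
        url = next_links[0] if next_links else None
    return found
```

Passing fetch in as a parameter keeps the paging logic separate from the network call, so it can be tried against saved pages before pointing it at a live site.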

You need to install youtube-dl, and the best way to do it is to install Chocolatey. Start a PowerShell session (“Run as Administrator”) and run the following command:

iwr https://chocolatey.org/install.ps1 -UseBasicParsing | iex

After it completes (and remember, it won’t work if you don’t run PowerShell as Administrator), run the following code:

choco install youtube-dl

That will do it. If you’re familiar with Debian derivatives, Chocolatey works much like apt-get. Here’s what the second part of the script should look like:

#loading the file into $youtubemovies
$youtubemovies = Get-Content youtubemovies.txt

#ForEach loop to send each line in the file to youtube-dl
ForEach($youtubemovie in $youtubemovies)
    {
    youtube-dl -o 'E:/Youtube/YouTubemovies/%(title)s.%(ext)s' $youtubemovie
    }

In the second part here, we’re using the default youtube-dl settings and only specifying where we want the file to be saved. If you do not give a path, it’ll use the current working directory (which you probably don’t want). I’ve got an external drive (E:) so that’s what I’m using here. You’ll also notice that I’m using youtube-dl’s output template fields, %(title)s and %(ext)s, for the naming; you could choose to get more descriptive. The important part to notice is that we’re giving youtube-dl the next URL on each pass through the loop. It’ll do the rest.
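
As a hedged sketch, the same loop could be driven from Python with subprocess. build_command and download_all are hypothetical helper names, and the run parameter exists only so the function can be exercised without actually invoking youtube-dl. One assumption to watch: Out-File in Windows PowerShell writes UTF-16 by default, so a list produced by the script above would need encoding="utf-16" here.

```python
import subprocess


def build_command(url, out_dir):
    """Return the youtube-dl argument list for a single URL."""
    # %(title)s and %(ext)s are youtube-dl output template fields.
    template = out_dir.rstrip("/") + "/%(title)s.%(ext)s"
    return ["youtube-dl", "-o", template, url]


def download_all(list_file, out_dir, encoding="utf-8", run=subprocess.call):
    """Run youtube-dl once per non-empty line of list_file."""
    with open(list_file, encoding=encoding) as handle:
        urls = [line.strip() for line in handle if line.strip()]
    # One failed download does not stop the rest; each URL is paired
    # with the exit code that run() returned for it.
    return [(url, run(build_command(url, out_dir))) for url in urls]
```

Something like download_all("youtubemovies.txt", "E:/Youtube/YouTubemovies", encoding="utf-16") would then mirror the PowerShell loop, and the returned exit codes show which links failed.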

I left the second portion of our script running for the better part of a day and it was already over 250GB of files, so be warned if you’re on a metered connection.

Update: It’s finished – 556 movies (well, files anyway), 273GB total.

Categories
IE powershell

PowerShell: Accessing Data in IFRAME in IE Com Object

A lot of my work involves using PowerShell and the IE COM object to extract data. We’ve got a VOIP system at work that is really inflexible and doesn’t have any “critical level” functionality. For example, supervisors need to be notified if the call center’s current on-hold levels are extreme. The problem has been that the intranet portal with this information is JavaScript-crazy, stacked on top of multiple IFRAMEs with all sorts of nesting. Getting access to a specific IFRAME with all of that going on took a lot of patience, but I want to share it in case anybody else runs into this (or I forget how I accessed it).

This gets you the source html of the iframe in string format. You can then do all the string manipulation necessary to get your work done.

#$ie is assumed to be an IE COM object that has already loaded the page.
$html = $ie.Document.IHTMLDocument3_getElementById("The_IFRAME_ID").contentWindow.IHTMLWindow2_document.body.IHTMLElement_outerHTML

Categories
microsoft word powershell

PowerShell: Automating Search and Replace in Microsoft Word

I’m required to send various letters and internal memos on a schedule. I hate taking the boilerplate and going through and manually updating it to reflect the current period’s numbers, but there really aren’t many options. The folks receiving the communications expect them in a specific format, and they don’t like surprises. This means I’m using the same Word documents that have been used for 10+ years. I’ve tried to recreate them in Access or Excel and it never quite looks right. I found the solution in PowerShell here.

The idea is to take the boilerplate Word document and create a template where all of the variable elements have a unique name to be searched and replaced (for example, the date at the top of the letter reads CURRENTDATE). We then use a function, based on that link to the Microsoft explanation, to switch each variable element to what we want. So, for CURRENTDATE, we’d replace it with (get-date).tostring('MMMM dd, yyyy') to get February 13, 2017.

Anyway, I’m going to change my processes to use this all over the place.

#function that does the search and replace.
function replaceword ($FindText, $ReplaceText)
    {
    $ReplaceAll = 2
    $FindContinue = 1
    $MatchCase = $False
    $MatchWholeWord = $True
    $MatchWildcards = $False
    $MatchSoundsLike = $False
    $MatchAllWordForms = $False
    $Forward = $True
    $Wrap = $FindContinue
    $Format = $False

    $objSelection.Find.Execute($FindText,$MatchCase,
      $MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
      $MatchAllWordForms,$Forward,$Wrap,$Format,
      $ReplaceText,$ReplaceAll) | out-null
    }

#creating the word com object
$objWord = New-Object -ComObject word.application
$objWord.Visible = $True
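#Word resolves relative paths against its own working folder, so pass the full path.
$objDoc = $objWord.Documents.Open((Resolve-Path "Payment.doc").Path)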
$objSelection = $objWord.Selection

#function call
replaceword "CURRENTDATE" (get-date).tostring('MMMM dd, yyyy')
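
The placeholder idea isn't specific to Word. As a rough sketch in Python (fill_template and long_date are names made up for this example), the same whole-word, case-insensitive replacement on a plain string might look like this:

```python
import re
from datetime import date


def fill_template(text, values):
    """Replace each placeholder in text with its value, whole words only."""
    for placeholder, replacement in values.items():
        pattern = r"\b" + re.escape(placeholder) + r"\b"
        # A function replacement keeps any backslashes in the value literal.
        text = re.sub(pattern, lambda _match, value=replacement: value,
                      text, flags=re.IGNORECASE)
    return text


def long_date(day):
    """Format a date like 'February 13, 2017' (month name follows the locale)."""
    return day.strftime("%B %d, %Y")


letter = "Date: CURRENTDATE. Amount due: AMOUNTDUE."
print(fill_template(letter, {
    "CURRENTDATE": long_date(date(2017, 2, 13)),
    "AMOUNTDUE": "$1,250.00",
}))
```

The whole-word match mirrors $MatchWholeWord = $True in the Word version, so a placeholder that happens to sit inside a longer word is left alone.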

Categories
IFTTT powershell

PowerShell: Trigger IFTTT Maker Channel using PowerShell v2

I’m in an environment where only PowerShell version 2 is available, which limits functionality and forces me into workarounds. I’ve been interested in using the Maker Channel on IFTTT in PowerShell for some time, but everybody seems to be using Invoke-RestMethod, which is unavailable in PowerShell v2. The following function uses the .NET object System.Net.WebClient to accomplish the trigger in the older version.

You have to create your trigger first on the IFTTT website (or mobile app), include the three variables (or fewer), and link the associated action. I chose the “notification” option, which brings the trigger to my attention on my phone. This, of course, is incredibly versatile: you could set it up to notify somebody else via text message, phone call, email, or whatever. IFTTT has an awesome collection of ways to act on the trigger. Check it out.

Note: There seems to be a problem with including spaces in the values sent. I’m just leaving spaces out of the values, and it’s working fine now.

#EXAMPLE USAGE: iftttnotify "work_info" "reports" "accounting" "completed"
function iftttnotify ([string]$triggername,[string]$value1, [string]$value2, [string]$value3)    
    {
    $privatekey ="YOUR_PRIVATE_KEY"
    $URL="https://maker.ifttt.com/trigger/$triggername/with/key/$privatekey"
    $NVC = New-Object System.Collections.Specialized.NameValueCollection
    $NVC.Add("value1","$value1");
    $NVC.Add("value2","$value2");
    $NVC.Add("value3","$value3");
    $WC = New-Object System.Net.WebClient
    $WC.UseDefaultCredentials = $true
    $Result = $WC.UploadValues($URL,"POST", $NVC);
    $WC.Dispose();
    }
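
For anyone not stuck on PowerShell v2, here is a hedged sketch of the same form-encoded POST built with Python's standard library. build_trigger_request is a hypothetical helper that only constructs the request, so it can be inspected before anything is sent. Spaces in the values are encoded as + here; whether that sidesteps the spaces problem noted above is untested.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_trigger_request(trigger, key, value1="", value2="", value3=""):
    """Build (but do not send) the POST request for an IFTTT Maker trigger."""
    url = "https://maker.ifttt.com/trigger/{}/with/key/{}".format(
        quote(trigger, safe=""), quote(key, safe=""))
    # urlencode percent-encodes the values and turns spaces into "+".
    body = urlencode({"value1": value1, "value2": value2, "value3": value3})
    return Request(url, data=body.encode("ascii"), method="POST")


request = build_trigger_request(
    "work_info", "YOUR_PRIVATE_KEY", "monthly reports", "accounting", "completed")
print(request.full_url)
print(request.data)
```

Sending it would be a call to urllib.request.urlopen(request) with a real key in place of the placeholder.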

Categories
Uncategorized

Python: Automating Analog Video Capture with pyautogui

Early in December I was talking to a co-worker about how expensive tape-to-digital conversion was. I was charged over $60 locally for less than a couple of hours of tape, and when I saw their setup, it was an Elgato Video Capture device. I put it on my Amazon wishlist, and it went on sale (almost immediately after I had the aforementioned conversation) for $50-ish. I bought it, and I’ve been impressed with the results, so I volunteered to do a bunch of conversions (8mm) for my parents.

The end product is good, but the software managing the capture is a little bit clunky and doesn’t “sense” when a tape hits dead air, which means I have to babysit the process or let it take a long time and then trim it back down to size. The process goes like this:

  1. Give your project (video) a name.
  2. Click continue a couple times confirming that the device is receiving video and audio.
  3. Click a big red button to “Start Recording”.
  4. When you hit the point on the tape where there is no more content, you click “Stop Recording”.
  5. You are then given the opportunity to trim the beginning and end of the capture, and click continue.
  6. “Please wait while the movie is being processed.” For a long capture, this can take a half hour.
  7. Done!

So, this is a perfect opportunity to use what I learned in the last chapter of Al Sweigart’s (free) book “Automate the Boring Stuff with Python” regarding automating programs that cannot be manipulated through other means. Al wrote a really useful Python module called pyautogui that uses image recognition to automate a GUI. I have been looking for a good opportunity to use this, because most of the tasks I run into can be handled through faster, more efficient means than driving the GUI. Elgato’s video capture interface doesn’t let me intervene through an API, so this was perfect.

import time

import pyautogui
import requests


def ifttt_alert():
    # Fire the IFTTT trigger so a notification lands on my phone.
    requests.post("https://maker.ifttt.com/trigger/YOUR_TRIGGER_NAME/with/key/YOUR_IFTTT_SECRET_KEY")


def find(image):
    # Return the center of the image if it is on screen, otherwise None.
    # Newer pyautogui releases raise ImageNotFoundException instead of returning None.
    try:
        return pyautogui.locateCenterOnScreen(image, grayscale=True)
    except pyautogui.ImageNotFoundException:
        return None


# Poll once a second until the "no signal" image shows up and stays up.
nosignal = find('no-signal.png')
while nosignal is None:
    print("Still getting signal.")
    time.sleep(1)
    nosignal = find('no-signal.png')
    if nosignal is not None:
        print("Looks like the signal went out. Giving it 20 seconds to recover.")
        time.sleep(20)
        nosignal = find('no-signal.png')

print("Stopping recording.")
pyautogui.click(find('stoprecording.png'))
# Give the capture software time to wrap up before clicking "Continue".
time.sleep(10)
pyautogui.click(find('continue.png'))

# Wait for the "please wait" screen to go away, checking every 10 seconds.
while find('pleasewait.png') is not None:
    time.sleep(10)
ifttt_alert()
print("Process complete.")

It’s pretty straightforward. The idea is that I have to be there to cue up the tape, so I can manually handle starting the program and hitting play on the camcorder. Then I start this script and it constantly scans the screen for the image saved as no-signal.png.

If the script detects that image, it gives the signal 20 seconds to recover. It’s possible that there is just a break on the tape between recordings, and this gives us a chance to see if there is more before we shut down. If 20 seconds pass and the image is still found on the screen, we click “Stop Recording”, which the script locates using stoprecording.png.

We then give it ten seconds to wrap up its process (which is generous, but just in case) and click the “Continue” button.

Now this is the point where we run into the “Please wait while the movie is being processed” part. So we search the screen every 10 seconds for that message (pleasewait.png) until it no longer appears on the screen.

When it doesn’t exist anymore, the process is finished. As an added bonus, I make an API call to the IFTTT service that will then send a notification to my phone letting me know that it’s time to cue up another tape. The only thing left to do is to rewind the current tape, and I’m actually looking into whether an IR blaster could do that!
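
The two waiting patterns described above (wait for something to appear and stay for a grace period, then wait for something else to disappear) can be separated from pyautogui entirely. This is a minimal sketch under that assumption: wait_for_persistent and wait_until_gone are made-up names, is_visible is any function that returns True while the image is on screen, and sleep is injectable so the logic can be checked without real delays.

```python
import time


def wait_for_persistent(is_visible, grace=20, poll=1, sleep=time.sleep):
    """Block until is_visible() is true both before and after a grace period."""
    while True:
        if is_visible():
            # Might just be a gap between recordings; give it time to recover.
            sleep(grace)
            if is_visible():
                return
        else:
            sleep(poll)


def wait_until_gone(is_visible, poll=10, sleep=time.sleep):
    """Block until is_visible() stops being true."""
    while is_visible():
        sleep(poll)


# Dry run with canned readings instead of real screen captures.
readings = iter([False, True, False, False, True, True])
naps = []
wait_for_persistent(lambda: next(readings), sleep=naps.append)
print(naps)
```

In the capture script, is_visible would be a small function wrapping the pyautogui lookup for no-signal.png or pleasewait.png.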

This was a lot of fun, just had to share.