Invoke-WebRequest: Execute HTTP Requests, Download Files, Parse the Web with PowerShell | Ranjan.info

The Invoke-WebRequest cmdlet can be used to request HTTP/HTTPS/FTP resources directly from the PowerShell console. You can use this command to send HTTP requests (GET and POST), download files from websites, parse HTML web pages, perform authentication, fill out and submit web forms, etc. In this article, we will cover basic examples of using the Invoke-WebRequest cmdlet in PowerShell to interact with web services.

Get Web Page Contents with Invoke-WebRequest Cmdlet

The Invoke-WebRequest cmdlet allows you to send an HTTP, HTTPS, or FTP request with the GET method to a specified web page and receive a response from the server. This cmdlet is available starting with PowerShell version 3.0 on Windows.

In Windows PowerShell, the Invoke-WebRequest command has short aliases, including iwr and wget.

Run the following command:

Invoke-WebRequest -Uri "


Tip: If you are connected to the Internet through a proxy server, you must configure PowerShell to access the web through the proxy. If the proxy settings are not configured, you will receive an error when running the iwr command:

Invoke-WebRequest : Unable to connect to the remote server
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
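
One way to pass proxy settings is through the cmdlet's own -Proxy and -ProxyUseDefaultCredentials parameters. The following is a minimal sketch; the proxy address is a made-up placeholder, and the actual request is left commented out because $Uri is not defined here.

```powershell
# Hypothetical proxy address; replace it with the one used in your network.
$ProxyParams = @{
    Proxy                      = 'http://proxy.example.local:3128'
    ProxyUseDefaultCredentials = $true   # authenticate on the proxy as the current user
}

# Splat the proxy settings onto the request:
# Invoke-WebRequest -Uri $Uri @ProxyParams
```

Keeping the settings in a hashtable makes it easy to reuse them for every request in a script.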


The command loaded the page and displayed its contents in the PowerShell console. The response returned is not just the HTML code of the page. The Invoke-WebRequest cmdlet returns an object of type HtmlWebResponseObject. Such an object contains collections of forms, links, images, and other important elements of an HTML document. Let’s look at all the properties of this object:

$WebResponseObj = Invoke-WebRequest -Uri "
$WebResponseObj| Get-Member


To get the raw HTML code of the web page contained in the HtmlWebResponseObject object, run:

$WebResponseObj.content

You can display the HTML code together with the HTTP status line and headers returned by the web server:

$WebResponseObj.rawcontent


You can display only the HTTP headers returned by the web server (the HTTP status code is available separately in the StatusCode property):

$WebResponseObj.Headers

If the server returns status code 200, the request was successful and the web server is available and working correctly. The headers of such a response look like this:

Key               Value
---               -----
Transfer-Encoding chunked
Connection        keep-alive
Vary              Accept-Encoding,Cookie
Cache-Control     max-age=3, must-revalidate
Content-Type      text/html; charset=UTF-8
Date              Wed, 13 Jul 2022 02:28:32 GMT
Server            nginx/1.20.2
X-Powered-By      PHP/5.6.40

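To get the last modification time of a web page (the ParsedHtml property is available only in Windows PowerShell, where the page is parsed by the Internet Explorer engine):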

$WebResponseObj.ParsedHtml | Select lastModified


You can specify a user agent string when connecting to a web resource. PowerShell has a set of built-in user agent strings:

Invoke-WebRequest -Uri $uri -UserAgent ([Microsoft.PowerShell.Commands.PSUserAgent]::Chrome)

The list of available agents in PowerShell can be displayed like this:

[Microsoft.PowerShell.Commands.PSUserAgent].GetProperties() | Select-Object Name, @{n='UserAgent';e={ [Microsoft.PowerShell.Commands.PSUserAgent]::$($_.Name) }}


Or you can set your own UserAgent string:

Invoke-WebRequest -Uri $uri -UserAgent 'MyApplication/1.1'

Using Invoke-WebRequest with Authentication

Authentication is required to access some web resources. You can use different types of authentication with the Invoke-WebRequest cmdlet (Basic, NTLM, Kerberos, OAuth, or Certificate Authentication).

To perform basic authentication (authentication by name and password encoded in base64), you must first obtain the username and password:

$cred = Get-Credential
wget -Uri ' -Credential $cred
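
Basic authentication boils down to an Authorization header that carries "user:password" encoded in Base64. When a server expects that header on the very first request, it can be built by hand. This is only a sketch: the user name and password are made-up values, and the request itself is commented out.

```powershell
# Hypothetical credentials, for illustration only.
$user = 'demo'
$pass = 'secret'

# Basic authentication sends "user:password" encoded as Base64.
$pair    = "${user}:${pass}"
$bytes   = [System.Text.Encoding]::UTF8.GetBytes($pair)
$encoded = [System.Convert]::ToBase64String($bytes)
$headers = @{ Authorization = "Basic $encoded" }

# The header can then be passed to the cmdlet:
# Invoke-WebRequest -Uri $uri -Headers $headers
```

Note that Base64 is an encoding, not encryption, so this header should only be sent over HTTPS.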

To use the current Windows user credentials to perform NTLM or Kerberos authentication, add the -UseDefaultCredentials option:

Invoke-WebRequest ' -UseDefaultCredentials

The -UseDefaultCredentials parameter does not work with Basic authentication.

To authenticate with a certificate, you must specify its thumbprint:

Invoke-WebRequest ' -CertificateThumbprint xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

You can use modern Bearer/OAuth token authentication in your PowerShell scripts (the -Authentication and -Token parameters used below are available in PowerShell 6.0 and newer).

  1. First, you need to obtain an OAuth token from your REST API provider (outside the scope of this post);
  2. Convert the token to a secure string with the ConvertTo-SecureString cmdlet: $Token = "12345678912345678901234567890" | ConvertTo-SecureString -AsPlainText -Force
  3. Now you can do OAuth authentication:
    $Params = @{
    Uri = "
    Authentication = "Bearer"
    Token = $Token }
    Invoke-RestMethod @Params
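
In Windows PowerShell 5.1, where the -Authentication and -Token parameters are not available, the same token can be sent in a hand-built Authorization header. A minimal sketch, with a placeholder token and the request commented out:

```powershell
# Placeholder token, for illustration only.
$PlainToken = '12345678912345678901234567890'

# A bearer token travels in the Authorization header as "Bearer <token>".
$Headers = @{ Authorization = "Bearer $PlainToken" }

# Invoke-RestMethod -Uri $Uri -Headers $Headers
```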

Parse and Scrape HTML Web Pages with PowerShell

The Invoke-WebRequest cmdlet allows you to quickly and easily parse the content of any web page. When processing an HTML page, a collection of links, web forms, images, scripts, etc. is created.

Let’s see how to access specific items on a web page. For example, I want to get a list of all outgoing links (HREF objects) on the target HTML web page:

$SiteAdress = "
$HttpContent = Invoke-WebRequest -URI $SiteAdress
$HttpContent.Links | Foreach {$_.href }


To get the link text itself (contained in the innerText property), you can use the following command:

$HttpContent.Links | fl innerText, href

You can select only the links with a specific CSS class:

$HttpContent.Links | Where-Object {$_.class -eq "page-numbers"} | fl innerText, href

or specific text in the URL address:

$HttpContent.Links | Where-Object {$_.href -like "*powershell*"} | fl innerText,href
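
The two filters can also be combined in a single Where-Object condition. The sketch below runs against a few mock link objects that stand in for $HttpContent.Links, so the data is hypothetical and no web request is made.

```powershell
# Mock link objects standing in for $HttpContent.Links (hypothetical data).
$Links = @(
    [pscustomobject]@{ innerText = 'Next';  href = 'https://example.com/page/2';           class = 'page-numbers' }
    [pscustomobject]@{ innerText = 'Guide'; href = 'https://example.com/powershell/guide'; class = '' }
    [pscustomobject]@{ innerText = 'Home';  href = 'https://example.com/';                 class = '' }
)

# Keep a link if it is a pagination link OR its URL contains "powershell".
$Selected = $Links | Where-Object { $_.class -eq 'page-numbers' -or $_.href -like '*powershell*' }
$Selected | Format-List innerText, href
```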


Then display a list of all the images on this page:

$HttpContent.Images

Create a collection of URL paths for these images (taken from the src attribute of each image):

$images = $HttpContent.Images | select src

Initialize a new instance of the WebClient class:

$wc = New-Object System.Net.WebClient

And download all the image files (with their original file names) from the page to the c:\tools\ folder:

$images | foreach { $wc.DownloadFile( $_.src, ("c:\tools\"+[io.path]::GetFileName($_.src) ) ) }
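
The src attribute of an image is not always an absolute URL. If a page uses relative paths, they can be resolved against the page address before downloading. A minimal sketch, assuming a made-up page URL and made-up src values:

```powershell
# Hypothetical page address and src values, for illustration only.
$PageUrl = [uri]'https://example.com/blog/post.html'
$Sources = '/img/logo.png', 'pics/photo.jpg', 'https://cdn.example.com/a.gif'

# [uri]::new(base, relative) resolves a relative reference against the page URL;
# an src that is already absolute is returned unchanged.
$Absolute = foreach ($src in $Sources) { [uri]::new($PageUrl, $src).AbsoluteUri }
$Absolute
```

The resulting strings can be passed to DownloadFile in place of the raw src values.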


How to download a file over HTTP/FTP with PowerShell Wget (Invoke-WebRequest)?

Invoke-WebRequest allows you to download files from a web page or FTP site (it works like Wget or cURL on Windows). Suppose you want to download a file over HTTP using PowerShell. Run the following PowerShell command:

wget " -OutFile "c:\tools\firefox_setup.exe"


This command will download the file from the HTTP site and save it to the specified directory.

You can get the size of a file in MB before downloading it with wget:

$url = "
(Invoke-WebRequest $url -Method Head).Headers.'Content-Length'/1Mb
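
Depending on the PowerShell version, a header value may come back as a single string or as an array of strings, and an array cannot be divided directly. A small helper can handle both cases. Get-SizeInMB is a hypothetical name used only for this sketch, and the sample value is made up.

```powershell
function Get-SizeInMB {
    param($ContentLength)   # header value: a string, or an array of strings

    # Take the first value, then convert it to a number before dividing.
    $first = @($ContentLength)[0]
    [math]::Round([double]$first / 1MB, 2)
}

Get-SizeInMB '52428800'      # header value as a single string
Get-SizeInMB @('52428800')   # header value as a string array
```

With a real response, the call could look like Get-SizeInMB (Invoke-WebRequest $url -Method Head).Headers.'Content-Length'.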


Below is an example of a PowerShell script that will find all links to *.pdf files on a target web page and bulk download all files from the website to your computer (each PDF file is saved under a random name):

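# Target folder and the page to scan for PDF links
$OutDir = "C:\docs\download\PDF"
$SiteAddress = "https://sometechdocs.com/pdf"
$HttpContent = Invoke-WebRequest -Uri $SiteAddress
# Download every link that ends in .pdf; Join-Path adds the path separator between the folder and the file name
$HttpContent.Links | Where-Object {$_.href -like "*.pdf"} | ForEach-Object {Invoke-WebRequest -Uri $_.href -OutFile (Join-Path $OutDir "$(Get-Random 200000).pdf")}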

As a result, the script downloads all PDF files from the page to the target directory. Each file is saved under a random name.
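
Random names avoid collisions but lose the original file names. As an alternative, the name can be derived from the link itself. Get-FileNameFromUrl is a hypothetical helper written for this sketch, and the sample URLs are made up.

```powershell
function Get-FileNameFromUrl {
    param([string]$Url)

    # LocalPath is the path part of the URL without the query string;
    # GetFileName then returns its last segment.
    [System.IO.Path]::GetFileName(([uri]$Url).LocalPath)
}

Get-FileNameFromUrl 'https://sometechdocs.com/pdf/guide.pdf?ver=2'
Get-FileNameFromUrl 'https://sometechdocs.com/pdf/manual.pdf'
```

The result can be joined with the output folder, for example Join-Path $OutDir (Get-FileNameFromUrl $_.href). Keep in mind that two links with the same file name would then overwrite each other.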

In modern PowerShell Core (6.1 and newer), the Invoke-WebRequest cmdlet supports resume mode. Update your version of PowerShell Core, and you can use the -Resume option of the Invoke-WebRequest command to resume a file download that was interrupted because the communication channel or the server became unavailable:

Invoke-WebRequest -Uri $Uri -OutFile $OutFile -Resume

How to fill and submit HTML forms with PowerShell?

Many web services require you to fill in various data in HTML forms. With Invoke-WebRequest, you can access any HTML form, fill in the required fields, and submit the filled form back to the server. In this example, we’ll show how to sign in through Facebook’s standard web form using PowerShell.


With the following command, we will save information about connection cookies in a separate session variable:

$fbauth = Invoke-WebRequest -SessionVariable session

Display the list of fields to fill in the login HTML form (login_form) using the following command:

$fbauth.Forms["login_form"].Fields

Specify the desired value in all fields:

$fbauth.Forms["login_form"].Fields["email"] = "[email protected]"

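$fbauth.Forms["login_form"].Fields["pass"] = 'Coo1P$wd'

etc. Note the single quotes around the password: in a double-quoted string, PowerShell would treat $wd as a variable and expand it.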

To submit the completed form to the server, send a POST request to the URL from the HTML form’s action attribute:

$Log = Invoke-WebRequest -method POST -URI ("" + $fbauth.Forms["login_form"].Action) -Body $fbauth.Forms["login_form"].Fields -WebSession $session

You can also use the JSON format to send data to a web page with the POST method:

$headers = @{
'Content-Type'='application/json'
'apikey'='0987654321'
}
$jsonbody = @{
"siteUrl" ="
"email" = "[email protected]"
}
Invoke-WebRequest -Method 'Post' -Uri $url -Body ($jsonbody |ConvertTo-Json) -Headers $headers -ContentType "application/json"
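
When the request body contains nested objects, it helps to set the -Depth parameter of ConvertTo-Json explicitly, because the default depth is small and deeper levels may not be serialized as expected. A minimal sketch with made-up field names and values:

```powershell
# Hypothetical request body with nested data, for illustration only.
$body = [ordered]@{
    siteUrl = 'https://example.com'
    tags    = @('a', 'b')
    options = @{ notify = @{ email = $true } }
}

# -Depth 5 is more than enough for this structure.
$json = $body | ConvertTo-Json -Depth 5
$json

# Convert back to check that the nested value survived serialization.
$roundTrip = $json | ConvertFrom-Json
$roundTrip.options.notify.email
```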

Invoke-WebRequest: Ignore SSL/TLS certificate verification

In Windows PowerShell, the Invoke-WebRequest cmdlet is closely tied to Internet Explorer. For example, on Windows Server Core editions where IE is not installed (or has been removed), the cmdlet fails with the following error unless the -UseBasicParsing parameter is specified:

Invoke-WebRequest: The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer’s first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again.

In this case, the WebClient class can also be used instead of Invoke-WebRequest. For example, to download a file from a specified URL, use the following command:

(New-Object -TypeName 'System.Net.WebClient').DownloadFile($Url, $FileName)

If an invalid SSL certificate is used on an HTTPS site, or PowerShell does not support the SSL/TLS protocol version required by the site, the Invoke-WebRequest cmdlet drops the connection.

Invoke-WebRequest : The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel.
Invoke-WebRequest : The request was aborted: Could not create SSL/TLS secure channel.


By default, Windows PowerShell (in early builds of Windows 10, Windows Server 2016, and earlier versions of Windows) uses the legacy and insecure TLS 1.0 protocol for connections (see the blog post describing the PowerShell module installation error "Install-Module: Unable to download from URI").

If the legacy TLS 1.0 and TLS 1.1 protocols are not disabled in Windows, you must run the following command to use TLS 1.2 in a PowerShell connection:

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
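
The command above replaces the whole protocol list with TLS 1.2. If the protocols that are already enabled should stay available, TLS 1.2 can be added to the current value with a bitwise OR. This is a sketch for Windows PowerShell, where the cmdlet relies on ServicePointManager:

```powershell
# Add TLS 1.2 to the protocols that are already enabled instead of replacing them.
[Net.ServicePointManager]::SecurityProtocol = [Net.ServicePointManager]::SecurityProtocol -bor [Net.SecurityProtocolType]::Tls12

# Show the resulting set of protocols.
[Net.ServicePointManager]::SecurityProtocol
```

The setting applies only to the current PowerShell session.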

If a self-signed certificate is used on an HTTPS site, the Invoke-WebRequest cmdlet refuses to receive data from it. To ignore (discard) invalid SSL certificates, use the following PowerShell code:

Add-Type @"
    using System.Net;
    using System.Security.Cryptography.X509Certificates;
    public class TrustAllCertsPolicy : ICertificatePolicy {
        public bool CheckValidationResult(
            ServicePoint srvPoint, X509Certificate certificate,
            WebRequest request, int certificateProblem) {
            return true;
        }
    }
"@
$AllProtocols = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'
[System.Net.ServicePointManager]::SecurityProtocol = $AllProtocols
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy
$result = Invoke-WebRequest -Uri "

In PowerShell Core, the Invoke-WebRequest cmdlet has an additional parameter, -SkipCertificateCheck, which allows you to ignore invalid SSL/TLS certificates.

Another significant drawback of the Invoke-WebRequest cmdlet is its low performance. When downloading a file, the HTTP stream is completely buffered in memory and saved to disk only after the full download is complete. Thus, when downloading large files using Invoke-WebRequest, you may run out of RAM.
