download and convert multiple youtube links
so i wanted to get music from youtube
This is the simple version. It gets files one by one, and it only supports video URLs (no playlists).
You need to have youtube-dl installed (latest version). I got it from here: https://packages.debian.org/sid/all/youtube-dl/download
Just put your YouTube URLs in a file called "list0.txt", one on each line.
It downloads m4a (audio only) and then converts to mp3.
Code:
#!/bin/bash
This is the second version. It does stuff in parallel.
For this you also need to install aria2 and GNU parallel.
Just put your YouTube URLs (videos/playlists/channels) in "list0.txt", one on each line.
Code:
#!/bin/bash
Tell me what you think.
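The one-by-one flow of the simple version could look roughly like the sketch below. This is not the original script: the function names `run` and `fetch_and_convert` and the `DRY_RUN` switch are made up for illustration, and the youtube-dl format selector is an assumption about how to get m4a audio only. By default it only prints the commands it would run.

```bash
#!/bin/bash
# Sketch only, not the script from the post.
# Assumes youtube-dl and ffmpeg are on PATH when DRY_RUN=0.

# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: file with one video URL per line (e.g. list0.txt)
fetch_and_convert() {
    local list="$1" url f
    # Step 1: fetch the m4a audio for each URL, one at a time.
    while IFS= read -r url; do
        [ -z "$url" ] && continue          # skip empty lines
        run youtube-dl -f 'bestaudio[ext=m4a]' -o '%(id)s.%(ext)s' "$url"
    done < "$list"
    # Step 2: convert every m4a in the current directory to mp3.
    for f in *.m4a; do
        [ -e "$f" ] || continue            # glob matched nothing
        run ffmpeg -i "$f" -acodec libmp3lame -aq 2 "${f%.m4a}.mp3"
    done
}
```

Call it as `fetch_and_convert list0.txt` to see the commands, or as `DRY_RUN=0 fetch_and_convert list0.txt` to actually run them.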
new version 20190820
Download the audio (m4a) of videos with more than 1,000,000 views from multiple channels, and convert it to mp3.
Code:
#!/bin/bash
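One way to get the view-count filter is youtube-dl's own `--min-views` option. Whether the script above used it is not shown, so treat this as an assumption. The helper name `build_cmd` is made up, and it only prints the command line so nothing gets downloaded by accident.

```bash
#!/bin/bash
# Sketch only: prints a youtube-dl command line instead of running it.
# build_cmd is a hypothetical helper, not part of the original script.
# $1: batch file with channel/playlist/video URLs, $2: minimum view count
build_cmd() {
    local list="$1" min_views="${2:-1000000}"
    local cmd=(youtube-dl -a "$list" --min-views "$min_views"
               -f 'bestaudio[ext=m4a]' -o '%(title)s.%(ext)s'
               --restrict-filenames)
    # %q quotes each word so the printed line can be pasted into a shell.
    printf '%q ' "${cmd[@]}"
    printf '\n'
}

build_cmd list0.txt
```

Running the printed line does the actual download. youtube-dl has to look at each video's metadata before it can compare view counts, so this can be slow on big channels.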
Quote:
Code:
cat list1.txt |sed 's/": "/__/g ; s/"/_/g; s/,/|/g ;'|grep -Po '_id__(.{11})_'|sort|uniq | sed 's/_id__//g;s/_$//g' |grep -v \| >list2.txt
Code:
youtube-dl -a list0.txt -j --flat-playlist --no-check-certificate > list1.txt
Instead of
Code:
find . > file1
Code:
while read i;
Code:
find . -name "*.m4a" |parallel -j4 ffmpeg -i "{}" -acodec libmp3lame -aq 2 "{}.mp3"
Code:
"{}.mp3"
Code:
parallel echo {} {.} {.}.mp3 ::: test.mp4
Code:
#mkdir mp & mv *mp3 mp/ & mkdir m4 & mv *m4a m4/
Code:
parallel ffmpeg -i {} mp3_dir/{.}.mp3 ::: *mp4
I threw this together
Code:
ytaudio() { parallel "youtube-dl -qx --audio-format 'mp3' -o '%(title)s.%(ext)s' --restrict-filenames {} && echo Processed {}" ::: "$@"; }
Code:
ytaudio link1 [linkn] [playlistlinkn]
Code:
# ytaudio() { # We're defining a bash function here
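The difference between "{}.mp3" and {.}.mp3 in the parallel snippets above is just the file extension: {} is the whole input, {.} is the input with its extension removed. If you don't have parallel at hand, the same naming can be tried in plain bash. `strip_ext` is a made-up helper that only approximates what {.} does.

```bash
#!/bin/bash
# Sketch: approximate GNU parallel's {.} (input without extension) in bash.
# strip_ext is a hypothetical helper name.
strip_ext() {
    # Remove a trailing ".ext" only if it belongs to the last path component.
    local re='^(.*)\.[^/.]+$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        printf '%s\n' "$1"
    fi
}

f=test.m4a
echo "$f.mp3"                  # like "{}.mp3"  -> test.m4a.mp3
echo "$(strip_ext "$f").mp3"   # like {.}.mp3   -> test.mp3
```

So with "{}.mp3" every converted file ends up as something.m4a.mp3, while {.}.mp3 gives something.mp3.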
can you provide sample lists?
Anyway, after a very quick look I just wanted to clean some things up.
No need for cat: http://porkmail.org/era/unix/award.html
Code:
cat list1.txt |sed 's/": "/__/g ; s/"/_/g; s/,/|/g ;'|grep -Po '_id__(.{11})_'|sort|uniq | sed 's/_id__//g;s/_$//g' |grep -v \| >list2.txt
Code:
<list1.txt sed 's/": "/__/g ; s/"/_/g; s/,/|/g ;'
There are many ways to do this; here is one with sed.
Code:
<list1.txt sed -n 's/.*__id_\([a-zA-Z0-9]\+\)_.*/\1/p'
Well, I substituted the whole line with what was found in the (). You can save multiple patterns and reorder them, e.g.
Code:
# edit didn't reorder
some junk before __id_gu7d54djkl_ more junk at end
outputs
gu7d54djkl__id_
Looking back at your pipe to pipe to pipe again... it looks like you created the __id_ placeholders. Are you sedding JSON data? If so, you may find jq useful. It has a steep learning curve, but it is well worth it.
As an example, some output from api.tmdb.org:
Code:
{"page":1,"total_results":2,"total_pages":1,"results":[{"vote_count":13,"id":45049,"video":false,"vote_average":7.6,"title":"The Code","popularity":0.731,"poster_path":"\/fvIEpbgUS45JLZg6OZpq6ke9wOI.jpg","original_language":"en","original_title":"The Code","genre_ids":[28,53,99],"backdrop_path":"\/kwG1vm97uUFTiGhTgiJYr9aB0AM.jpg","adult":false,"overview":"The Code is a Finnish-made documentary about Linux, featuring some of the most influential people of the free software movement.","release_date":"2001-09-26"},{"vote_count":0,"id":243915,"video":false,"vote_average":0,"title":"LINUX die Reise des Pinguins","popularity":0.6,"poster_path":null,"original_language":"de","original_title":"LINUX die Reise des Pinguins","genre_ids":[99],"backdrop_path":null,"adult":false,"overview":"","release_date":"2009-03-14"}]}
but with jq
Code:
<tmdb_api_output.json jq -C "."
Code:
jq -r ".results[0]|.id,.title,.release_date,.overview,.vote_count,.vote_average"
Code:
45049
I even wrote a very nasty sed script to convert PulseAudio's `pacmd list-sink-inputs` output to JSON, because it was so much nicer to use jq to automate stuff.
tl;dr: use jq instead of sed
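Applied to the id extraction from earlier in the thread, the jq route might look like the sketch below. It assumes `youtube-dl -j --flat-playlist` writes one JSON object per line with an "id" field, which is what the sed pipeline appears to be matching. The sample lines and ids are made-up placeholders, and `extract_ids` is a hypothetical helper name. It needs jq installed.

```bash
#!/bin/bash
# Made-up sample in the assumed shape of `youtube-dl -j --flat-playlist`
# output: one JSON object per line. The ids are placeholders, not real videos.
sample='{"_type": "url", "id": "AAAAAAAAAAA", "title": "first"}
{"_type": "url", "id": "BBBBBBBBBBB", "title": "second"}
{"_type": "url", "id": "AAAAAAAAAAA", "title": "first again"}'

# Reads JSON objects on stdin, prints each distinct id once, sorted.
extract_ids() {
    jq -r '.id' | sort -u
}

extract_ids <<<"$sample"
```

With real data that would be something like `<list1.txt extract_ids >list2.txt`, replacing the whole sed/grep/sort/uniq chain.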
oops, forgot when I got carried away with jq
Code:
....|sort|uniq...
Code:
sort -u
Play around with this:
Code:
while read yt_id;do
Something else you can try:
Code:
raw_ytdl_json=$(
Code:
<<<$raw_ytdl_json jq -r "._filename[2:13]"
https://stedolan.github.io/jq/manual/
Quote:
The first example is better. I would probably stick them into a bash array and work with them later, something like
Code:
......
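A minimal sketch of the array idea, assuming the ids sit in a file with one id per line. `load_ids` and `yt_ids` are made-up names, the ids in the demo are placeholders, and `mapfile` needs bash 4 or newer.

```bash
#!/bin/bash
# Sketch: load unique ids into a bash array and work with them later.
# $1: file with one id per line; fills the global array yt_ids.
load_ids() {
    mapfile -t yt_ids < <(sort -u -- "$1")
}

# Demo with placeholder ids fed through process substitution.
load_ids <(printf '%s\n' bbbbbbbbbbb aaaaaaaaaaa bbbbbbbbbbb)

echo "got ${#yt_ids[@]} ids"
for yt_id in "${yt_ids[@]}"; do
    echo "would process $yt_id"
done
```

Keeping the ids in an array means the list is read and deduplicated once, and later steps can loop over it without re-running the pipeline.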
ok, so I got bored
This is quite dumb, and it is not secure to go piping stuff directly into ffmpeg, but for fun I came up with this.
Code:
#!/bin/bash
And having ffmpeg use stdin from some random internet page is asking for trouble.
It would be *much* safer to download the m4a and then use ffmpeg after testing that the m4a is actually AAC data (which is what you have already been doing ;) ).
I just thought it would be fun to skip that step.
See if you can come up with some checks on the ids and URLs, that they are in the expected length/format.
Maybe use parallel to start a chain of Get_m4a && check_m4a && ffmpeg_to_mp3.
Since jq is a shiny new toy:
Code:
mediainfo --Output=JSON |
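A rough sketch of the id check and the chain suggested above. The function names Get_m4a, check_m4a and ffmpeg_to_mp3 come from the post, but their bodies here are assumptions. `is_valid_id` and `process_id` are made-up helpers, the 11-character id format is assumed from the `.{11}` in the earlier grep, and the AAC check uses ffprobe rather than mediainfo. Nothing is downloaded unless you call `process_id` yourself.

```bash
#!/bin/bash
# Sketch only: names follow the post, bodies are assumptions.

# Assumed id format: exactly 11 characters from A-Z, a-z, 0-9, "_" and "-".
is_valid_id() {
    local re='^[A-Za-z0-9_-]{11}$'
    [[ $1 =~ $re ]]
}

# Download audio-only m4a as ./ID.m4a ("--" protects ids starting with "-").
Get_m4a() {
    youtube-dl -q -f 'bestaudio[ext=m4a]' -o "$1.%(ext)s" -- "$1"
}

# Succeeds only if the first audio stream of ./ID.m4a is AAC.
# Uses ffprobe here, a swapped-in tool instead of mediainfo.
check_m4a() {
    local codec
    codec=$(ffprobe -v error -select_streams a:0 \
                -show_entries stream=codec_name \
                -of default=noprint_wrappers=1:nokey=1 "./$1.m4a")
    [ "$codec" = aac ]
}

# Convert ./ID.m4a to ./ID.mp3 without letting ffmpeg read stdin.
ffmpeg_to_mp3() {
    ffmpeg -nostdin -loglevel error -i "./$1.m4a" \
           -acodec libmp3lame -aq 2 "./$1.mp3"
}

# Each step runs only if the previous one succeeded.
process_id() {
    is_valid_id "$1" && Get_m4a "$1" && check_m4a "$1" && ffmpeg_to_mp3 "$1"
}

# Demo of the check only, with placeholder ids.
is_valid_id abcdefghijk && echo "abcdefghijk looks like an id"
is_valid_id 'abc;rm -rf' || echo "rejected"
```

With GNU parallel the chain could probably be started as `parallel process_id :::: list2.txt`, after running `export -f` on each of the functions so that the shells parallel starts can see them.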