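Engineer Introduction Series (エンジニア入門シリーズ)

Master It from Scratch! A Practical Guide to Binary File Analysis with Colab × Python
(ゼロからマスター！Colab×Pythonでバイナリファイル解析実践ガイド)

Supervised by: 大塚玲
Authors: 大坪雄平, 萬谷暢崇, 羽田大樹, 染谷実奈美
List price: 4,400 yen (base price 4,000 yen + tax)
Format: B5 variant (B5変型)
Pages: 416
ISBN: 978-4-910558-35-6
Release date: 2024/10/17
Catalog No.: 133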

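【Table of Contents】

Chapter 1: Warming Up for Binary Analysis

1.1 Why Python
1.1.1 Simple, readable code
1.1.2 Rich set of libraries
1.1.3 Open source with an active community
1.2 Setting up the programming environment
1.2.1 Features of Colab
1.2.2 Creating a notebook on Google Drive
1.2.3 Changing and checking the programming environment settings
1.2.4 Entering and running code in a cell
1.2.5 Colab notebook features
1.3 Python basics
1.3.1 Arithmetic operations in Python
1.3.2 Variables and the main data types
1.3.3 Comments
1.3.4 Collections
1.3.5 Control flow
1.3.6 Functions
1.4 Preparing to handle binary data in Python
1.4.1 Integer notation
1.4.2 Converting integers to binary, octal, and hexadecimal strings
1.4.3 Converting binary, octal, and hexadecimal strings to integers
1.4.4 A deep dive into Python's integer implementation
1.4.5 Bitwise operations
1.5 Binary sequence types
1.5.1 bytes objects
1.5.2 bytearray objects
1.5.3 Basic operations on bytes and bytearray
1.6 Various encodings
1.6.1 Caesar cipher
1.6.2 Base64
1.6.3 Deflate
1.7 Practice with binary data: writing your own Base64 encode and decode functions
Column: A challenge from this book: CTF on a business card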

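Chapter 2: Working with Binary Files

2.1 Reading and writing binary files
2.1.1 Preparing files for analysis
2.1.2 File operation basics
2.1.3 Uploading files to session storage
2.1.4 Accessing files on Google Drive
2.2 Getting an overview of a whole file
2.2.1 Trying a hex dump
2.2.2 Practice: writing your own strings
2.2.3 Overview of a whole file using a histogram
2.2.4 Overview of a whole file using information entropy
2.2.5 Overview of a whole file by rendering it as an image
Column: The binary editor Stirling and "eyeball grep" (目grep)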

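Chapter 3: Practicing Structural Analysis of Binary Files: Image Files

3.1 File structure of binary files
3.1.1 Basic elements of file structure
3.2 Identifying file formats
3.2.1 Identifying file formats using a library
3.2.2 Writing a simple file format identification program
3.2.3 Magika: AI-based file format identification
3.3 BMP format
3.3.1 Preparing a BMP file for analysis
3.3.2 Analysis using a library
3.3.3 Overview of the BMP file format
Column: Improvements in computer performance and the decline of color palettes
3.3.4 Introducing the struct module
Column: C strings and Pascal strings: string representations in memory
3.3.5 Writing a program that displays BMP file information
3.3.6 Application: steganography with BMP files
Column: New technology and research ethics, with steganography as an example
3.4 PNG format
3.4.1 Preparing a PNG file for analysis
3.4.2 Analysis using a library
3.4.3 Structure of PNG files
3.4.4 Writing a program that breaks a PNG file into its parts
Column: The GIF patent issue and the birth of PNG
3.5 JPEG format
3.5.1 Preparing a JPEG file for analysis
3.5.2 Analysis using a library
3.5.3 Structure of JPEG files
3.5.4 Writing a program that breaks a JPEG file into its parts
Column: Why write your own when convenient libraries exist?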

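Chapter 4: Structural Analysis of Binary Files in Practice: Container Files (Archives and Document Files)

4.1 zip format
4.1.1 Preparing a zip file for analysis
4.1.2 Analysis using a library
4.1.3 Structure of the zip file format
4.1.4 Writing your own zip extraction program
4.1.5 Application: recovering a corrupted zip file
Column: Analysis work and agile development
4.2 PDF format
4.2.1 Preparing a PDF file for analysis
4.2.2 Analysis using a library
4.2.3 Overview of the PDF specification
4.2.4 Learning the specification while analyzing a PDF
Column: o-checker: a malware detection tool that focuses on file formats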

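Chapter 5: Applications 1: Binary Refinery, a Toolbox for Binary File Analysis

5.1 What is Binary Refinery?
5.1.1 Setting up the runtime environment
5.2 Binary Refinery documentation and help
5.3 Features for input and output
5.3.1 Outputting file contents
5.3.2 Outputting information about data
5.3.3 Python script using the functionality of the ef command
5.3.4 Outputting manually entered binary data
5.3.5 Python script using the functionality of the emit command
5.3.6 Output to a file
5.3.7 Python script using the functionality of the dump command
5.4 Features for displaying data
5.4.1 Displaying a hex dump of file contents
5.4.2 Python script using the functionality of the peek command
5.5 Features for extracting data
5.5.1 Extracting data by range
5.5.2 Splitting data
5.5.3 Python script using the functionality of the snip and chop commands
5.6 Features for converting between binary and numbers
5.6.1 Decoding from hexadecimal
5.6.2 Decoding binary, octal, and other bases
5.6.3 Python script using the functionality of the pack command
5.7 Features for bitwise operations
5.7.1 Addition and subtraction
5.7.2 Python script using the functionality of the add and sub commands
5.7.3 Bit inversion, rotation, and shifting
5.7.4 Python script using the functionality of neg and related commands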
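5.8 XOR-related features
5.8.1 XOR operation
5.8.2 Python script combining the functionality of the xor and rotl commands
5.8.3 Estimating the XOR key
5.8.4 How the xkey command estimates the XOR key
5.8.5 Automatic decoding via XOR key estimation
5.8.6 Python script using the functionality of the xkey command
Column: Situations where the xkey and autoxor commands are useful
5.9 Features for decoding and encoding
5.9.1 Base64 decoding
5.9.2 Base64 encoding and decoding with a custom character set
5.9.3 Reversing a byte sequence
5.9.4 Reversing the bytes within each block
5.9.5 Caesar cipher
5.9.6 URL decoding
5.9.7 Python script combining the functionality of rev and related commands
5.10 Compression-related features
5.10.1 Decompressing zlib-compressed data
5.10.2 Decompressing LZMA-compressed data
5.10.3 Automatically decompressing data compressed with various algorithms
5.10.4 Python script using the functionality of the decompress command
5.11 Other features
5.11.1 Cryptography-related features
5.11.2 Features for parsing file structures
5.11.3 File carving features
5.12 Exercise: analyzing an obfuscated PHP script
5.12.1 The PHP script to analyze
5.12.2 Working out how to deobfuscate it
5.12.3 Python script that deobfuscates the obfuscated part
5.12.4 Alternative solution: deobfuscation by running commands
5.12.5 The PHP script that performs the obfuscation
Column: Binary file analysis tools being developed with Binary Refinery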

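Chapter 6: Structural Analysis of Binary Files in Practice: Executable Files

6.1 Preparing files for analysis
6.2 Preparing elftools, an ELF analysis library
6.3 Structure of ELF files
6.3.1 Structure of the ELF header
6.3.2 Structure of section headers
6.4 Obtaining the program code that runs first
6.5 Disassembling with Python
Column: Packers and entry points
6.6 Introduction to reading assembly code
6.6.1 (De)compiling and (dis)assembling
6.6.2 Preparing for self-study in reading assembly code
6.6.3 Assembly syntax
6.6.4 The stack
6.6.5 Conditional branching: the flags register and jump instructions
6.6.6 Functions
6.6.7 Instruction set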

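Chapter 7: Applications 2: Binary Analysis in Practice with CTF

7.1 CTF and binary analysis
7.2 Analyzing x86-64 programs
7.2.1 Example 1: a password unpacked in memory
7.2.2 Example 2: writing a solver to infer the password
7.2.3 Example 3: obfuscation with bitwise operations
7.3 Analyzing Python bytecode
7.3.1 How to analyze Python bytecode
7.3.2 Example 4: a challenge that involves analyzing a pyc file
7.4 Chapter summary
Column: A recommendation for CTF (Capture The Flag)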

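Chapter 8: Applications 3: Binary Analysis with Machine Learning, Using Malware Type Estimation as an Example

8.1 What is malware?
8.2 What is machine learning?
8.3 Malware analysis and machine learning
8.3.1 Malware analysis
8.3.2 Malware analysis using machine learning
8.4 Creating features
8.5 Malware classification using graph neural networks
8.5.1 Graph neural networks
8.5.2 Examining the dataset
8.5.3 Training and evaluation
8.6 How to create your own dataset
8.6.1 What is a virtual environment?
8.6.2 Precautions when handling malware
8.6.3 Installing VirtualBox
8.6.4 Preparing the virtual machine
8.6.5 Creating the dataset
8.6.6 Extracting the dataset
8.6.7 Restoring the virtual machine to a clean state
8.7 Challenges and outlook for machine-learning-based malware classification
8.8 Summary
Column: Academic conferences across cybersecurity fields
Afterword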

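Appendix

Appendix A: Cheat sheet for binary data operations in Python
Appendix B: Table mapping numeric notations to ASCII
Appendix C: Useful binary file analysis tools that can be used outside Colab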

【References】

高田寛. "プログラムのリバースエンジニアリングの法的課題." 企業法学研究2020, 第9巻2号 (2020). https://www.jabl.org/kigyouhougakukenkyuu2020Vo.9No.2_Takada.pdf
https://jupyter.org/
https://colab.research.google.com/
https://drive.google.com/
https://docs.python.org/ja/3/library/stdtypes.html#common-sequence-operations
https://docs.python.org/ja/3/reference/lexical_analysis.html#f-strings
https://docs.python.org/ja/3/library/functions.html#int
https://docs.python.org/ja/3/library/stdtypes.html#bytes
https://docs.python.org/ja/3/library/stdtypes.html#bytearray
https://docs.python.org/ja/3/library/stdtypes.html#bytes-methods
https://docs.python.org/ja/3/library/codecs.html#standard-encodings
https://docs.python.org/ja/3/library/codecs.html
https://datatracker.ietf.org/doc/html/rfc1421
https://datatracker.ietf.org/doc/html/rfc1950
https://datatracker.ietf.org/doc/html/rfc1952
https://datatracker.ietf.org/doc/html/rfc1951
MIL-STD-188-100, Military Standard: Common Long Haul Tactical Communication System Technical Standards, 1972
Shannon, Claude Elwood. "A mathematical theory of communication." The Bell system technical journal 27.3 (1948): 379-423.
https://numpy.org/doc/stable/user/basics.types.html
https://numpy.org/doc/stable/user/basics.creation.html
https://www.vector.co.jp/soft/win95/util/se079072.html
https://github.com/yotsubo/o-glasses
https://github.com/yotsubo/o-glassesX
https://github.com/yotsubo/o-glasses2023
https://github.com/ahupp/python-magic
https://www.darwinsys.com/file/
https://en.m.wikipedia.org/wiki/List_of_file_signatures
https://opensource.googleblog.com/2024/02/magika-ai-powered-fast-and-efficient-file-typeidentification.html
https://github.com/google/magika
https://google.github.io/magika/
https://docs.python.org/ja/3/library/pathlib.html
https://python-pillow.org/
https://www.colordic.org/s
https://docs.python.org/ja/3/library/struct.html
https://opencv.org/
https://scikit-image.org/
https://pypi.org/project/pypng/
http://www.libpng.org/pub/png/spec/1.2/png-1.2.pdf
PNG Signature: http://www.libpng.org/pub/png/book/chapter08.html
https://docs.python.org/ja/3/library/zlib.html
https://pillow.readthedocs.io/en/stable/_modules/PIL/PngImagePlugin.html
(repeated) http://www.libpng.org/pub/png/spec/1.2/png-1.2.pdf
https://ftp-osl.osuosl.org/pub/libpng/documents/pngext-1.5.0.html
http://www.libpng.org/pub/png/#history
https://developers.google.com/speed/webp?hl=ja
https://aomediacodec.github.io/av1-avif/
https://docs.python.org/3/library/zipfile.html
https://pkwaredownloads.blob.core.windows.net/pkware-general/Documentation/APPNOTE-6.3.9.TXT (among others)
https://pypi.org/project/pypdf/
https://pypdf.readthedocs.io/en/stable/index.html
https://pypi.org/project/pdfminer.six/
https://pypi.org/project/pdfminer/
https://pdfminersix.readthedocs.io/en/latest/faq.html#why-are-there-cid-x-values-in-the-textual-output
https://pdfminersix.readthedocs.io/en/latest/index.html
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
https://github.com/binref/refinery
https://github.com/binref/refinery/tree/master/tutorials
https://gchq.github.io/CyberChef/
https://git-scm.com/
https://binref.github.io/
https://datatracker.ietf.org/doc/html/rfc1321
https://datatracker.ietf.org/doc/html/rfc3174
https://datatracker.ietf.org/doc/html/rfc6234
https://docs.python.org/ja/3/library/argparse.html
https://docs.python.org/ja/3/howto/functional.html#generators
https://docs.python.org/ja/3/tutorial/controlflow.html#pass-statements
https://docs.python.org/ja/3/tutorial/errors.html
https://docs.python.org/ja/3/library/argparse.html#mutual-exclusion
三村守, 大坪雄平, 田中英彦. "悪性文書ファイルに埋め込まれたRATの検知手法." 情報処理学会論文誌 55.2 (2014): 1089-1099. https://lab.iisec.ac.jp/~tanaka_lab/images/pdf/journal/journal-2014-01.pdf
https://datatracker.ietf.org/doc/html/rfc4648
https://datatracker.ietf.org/doc/html/rfc4648#section-6
https://www.dcode.fr/base92-encoding#q1
https://datatracker.ietf.org/doc/html/rfc4648#section-5
https://docs.python.org/ja/3/library/urllib.parse.html#url-quoting
https://datatracker.ietf.org/doc/html/rfc1950
https://datatracker.ietf.org/doc/html/rfc1952
https://tukaani.org/xz/xz-file-format.txt
https://www.7-zip.org/sdk.html
https://www.sourceware.org/bzip2/
https://www.php.net/manual/ja/function.str-rot13
https://www.php.net/manual/ja/function.strrev
https://www.php.net/manual/ja/function.base64-decode
https://www.php.net/manual/ja/function.gzinflate
https://www.php.net/manual/ja/function.eval
https://www.php.net/manual/ja/function.strcmp
https://www.php.net/manual/ja/language.types.string.php#language.types.string.syntax.heredoc
https://downloadcenter.trellix.com/products/mcafee-avert/fileinsight.msi
https://github.com/nmantani/FileInsight-plugins/blob/master/README.ja.md
https://codeblue.jp/2019/bluebox/FileInsight-plugins/
https://www.blackhat.com/us-21/arsenal/schedule/
https://github.com/binref/refinery/issues/24
https://github.com/binref/refinery/pull/25
https://github.com/binref/refinery/issues/46
https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
https://www.capstone-engine.org/lang_python.html
https://github.com/lattera/glibc/blob/master/sysdeps/x86_64/start.S
Isawa, Ryoichi, et al. "An accurate packer identification method using support vector machine." IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 97.1 (2014): 253-263.
大坪雄平, 大塚玲, 岩田吉弘, 三村守, 榊剛史. "転移学習による機械語命令列分類における学習の効率化." 人工知能学会全国大会論文集 第33回 (2019). 一般社団法人人工知能学会, 2019.
https://en.wikipedia.org/wiki/X86_calling_conventions
https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build
https://rada.re/
https://docs.python.org/ja/3/library/dis.html
https://docs.python.org/ja/3/library/marshal.html
https://ctftime.org/
https://github.com/sajjadium/ctf-archives
https://policies.google.com/terms
https://github.com/elastic/ember
https://lief.re
Nataraj, Lakshmanan, et al. "Malware images: visualization and automatic classification." Proceedings of the 8th international symposium on visualization for cyber security. 2011.
Raff, Edward, et al. "Malware detection by eating a whole exe." Workshops at the thirty-second AAAI conference on artificial intelligence. 2018.
Minami Someya, Yuhei Otsubo, and Akira Otsuka. "FCGAT: Interpretable malware classification method using function call graph and attention mechanism." Proc. Netw. Distrib. Syst. Secur. (NDSS) Symp. 2023.
CBOW: Mikolov, Tomas et al. “Efficient Estimation of Word Representations in Vector Space.” International Conference on Learning Representations (2013).
skip-gram: Mikolov, Tomas et al. “Distributed Representations of Words and Phrases and their Compositionality.” Neural Information Processing Systems (2013).
https://rada.re/
https://radimrehurek.com/gensim
https://scikit-learn.org/stable/
Scarselli, F, et al. (2008). The graph neural network model. IEEE transactions on neural networks, 20(1), 61-80.
P. Veličković, et al. "Graph Attention Networks." 6th International Conference on Learning Representations. 2017.
https://pytorch.org
https://yutaroogawa.github.io/pytorch_tutorials_jp
https://github.com/pyg-team/pytorch_geometric
https://whyisyoung.github.io/BODMAS
https://docs.python.org/ja/3/library/pickle.html
https://seaborn.pydata.org/
https://www.virtualbox.org/
https://www.vmware.com/products/workstation-pro/html.html
https://learn.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/about/
https://www.oracle.com/jp/virtualization/technologies/vm/downloads/virtualbox-downloads.html
https://releases.ubuntu.com/jammy/
https://docs.python.org/ja/3/library/functions.html#repr
https://docs.python.org/ja/3/library/functions.html#eval

