confinedrv

setting up a virtual drive under Linux: how to safely boot an existing OS installation with Qemu by limiting other partitions to read-only access.


This page can be viewed online at www.elstel.org/qemu

confinedrv

virtual, partially write protected drives for Linux

First let us get acquainted with confinedrv, the tool used to limit certain partitions to read-only access or to fade them out (hide them) completely. confinedrv creates a virtual drive of its own, composed of read-only and read-write slices of the given base drive. The partition headers are mirrored in read-only mode. Note that confinedrv can only control read or write access in chunks of 4096 bytes, as this is the memory page size used by Linux nowadays. When creating a new partition scheme, only use sizes and bounds that are a multiple of eight sectors (one sector is 512 bytes, 512*8=4096), since only a multiple of eight can ensure full access protection; otherwise a mirrored partition can start up to 7 sectors earlier and end up to 7 sectors later than intended. This is due to limitations imposed by the paging mechanism of your hardware.
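
The eight-sector rule can be checked with a little shell arithmetic. The following is only a sketch: the function names are made up for illustration, and rounding outward to the next page boundary is an assumption that merely mirrors the "up to 7 sectors" statement above. The sample values are sector numbers as they appear in the fdisk listing further down.

```shell
#!/bin/sh
# Sketch: check 4096-byte (8-sector) alignment of partition bounds.
# Function names are hypothetical; they are not part of confinedrv.
SECTORS_PER_PAGE=8   # 4096 bytes per page / 512 bytes per sector

# is_aligned SECTOR: print "yes" if SECTOR is a multiple of 8, otherwise "no".
is_aligned() {
  if [ $(( $1 % SECTORS_PER_PAGE )) -eq 0 ]; then echo yes; else echo no; fi
}

# round_down SECTOR: nearest page boundary at or below SECTOR.
round_down() { echo $(( $1 / SECTORS_PER_PAGE * SECTORS_PER_PAGE )); }

# round_up SECTOR: nearest page boundary at or above SECTOR.
round_up() { echo $(( ($1 + SECTORS_PER_PAGE - 1) / SECTORS_PER_PAGE * SECTORS_PER_PAGE )); }

is_aligned 63          # start sector 63: not aligned
round_down 63          # page boundary below it: 56, i.e. 7 sectors earlier
is_aligned 199225344   # start sector 199225344: aligned
round_up 117194175     # first sector after a partition ending at 117194174
```

For the end of a partition it is the first sector after the partition (end + 1) that has to be a multiple of eight, since fdisk prints the last sector inclusively.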

Now let us have a quick test to show the capabilities of confinedrv.

> confinedrv --ra sdx=sdb9,10
dmsetup create sdx </tmp/tmp.nkmDSfW6tz
> fdisk -lu /dev/mapper/sdx
Disk /dev/mapper/sdx: 640.1 GB, 640135028736 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250263728 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006bdb7
Device             Boot      Start         End      Blocks   Id  System
/dev/mapper/sdx1                63   117194174    58597056   83  Linux
/dev/mapper/sdx2   *     117194175  1003685887   443245856+   5  Extended
/dev/mapper/sdx5         117194238   195318269    39062016   83  Linux
/dev/mapper/sdx6         195318333   199222064     1951866   fd  Linux raid autodetect
/dev/mapper/sdx7         199225344   277350399    39062528   83  Linux
/dev/mapper/sdx8         277352448   765624319   244135936   8e  Linux LVM
/dev/mapper/sdx9         765626368   843749375    39061504   83  Linux
/dev/mapper/sdx10        843751424  1003685887    79967232   8e  Linux LVM

A virtual drive called /dev/mapper/sdx has now been created. The command line specified that all partitions need to be at least readable (--ra option), so no partition is faded out completely. Note that even partitions to which access has been forbidden completely will show up in the partition table, because the partition table is passed on as is, read-only, to our newly mapped drive sdx. The two partitions specified on the command line (sdb9 and sdb10) are the only partitions which will be writeable on sdx.

So far so good: we have now mapped sdx but not its individual partitions as shown by fdisk. The entire disk as we have created it could already be passed as a drive to qemu without any further effort (see the section 'booting a virtual machine from a confined disk'). For now we just want to mount one of the partitions of our new drive, so let us get the individual sub-partitions mapped by kpartx -a.

> kpartx -a /dev/mapper/sdx
> partprobe /dev/mapper/sdx # may be used instead of kpartx
> ls /dev/mapper/sdx
sdx sdx1 sdx10 sdx2 sdx5 sdx6 sdx7 sdx8 sdx9
> mount /dev/mapper/sdx9 /mnt
> ls /mnt
bin dev dst home lib64 media mnt2 opt root sbin srv tmp var net src
boot dircmp etc lib lost+found mnt mnt3 proc run selinux sys usr dld sh

Now that we have seen it work let us tidy up everything again: kpartx -d unmaps the subpartitions and confinedrv -r removes our newly created virtual drive sdx. Finally we can check with losetup -f for the first free loop device, which should again be the same as before our first invocation of confinedrv.

> umount /dev/mapper/sdx9
> kpartx -d /dev/mapper/sdx
> ls /dev/mapper/sdx*
/dev/mapper/sdx
> confinedrv -r sdx
> losetup -f
/dev/loop0

If you are working with lvm the following commands may be handy to activate or deactivate volume groups as defined by some lvm partitions:

> vgdisplay # show volume groups
> vgchange -a y myvg # activate volume group myvg
> vgchange -a n # deactivate all volume groups

If you should ever wish to make a single partition read-only (e.g. an lvm partition) simply do the following:

> losetup -r /dev/loop7 /dev/lvm/testdeb
> blockdev --setro /dev/loop7
> chmod gua-w /dev/loop7
> mount /dev/loop7 /mnt
mount: /dev/loop7 is write-protected, mounting read-only

Note that losetup -r should already produce a read-only device. Nonetheless tests have shown that the drive may still be writable this way, so blockdev --setro is the most important step. Finally chmod gua-w /dev/loop7 rounds things off by making the device node itself read-only for user, group and all others. Use losetup -d /dev/loop7 to detach the loop device.
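
Since the read-only state is worth verifying, here is a small sketch that reads the kernel's read-only flag from sysfs (blockdev --getro /dev/loop7 reports the same flag as 0 or 1). The function name is_readonly and its optional second argument are made up for illustration; the second argument only exists so that the function can be tried without a real loop device.

```shell
#!/bin/sh
# is_readonly DEV [SYSFS_ROOT]: succeed if the kernel flags block device DEV
# (e.g. loop7) as read-only. SYSFS_ROOT defaults to /sys.
is_readonly() {
  dev=$1
  sysfs=${2:-/sys}
  [ "$(cat "$sysfs/block/$dev/ro" 2>/dev/null)" = "1" ]
}

if is_readonly loop7; then
  echo "loop7 is flagged read-only"
else
  echo "loop7 is writable or does not exist"
fi
```

Keep in mind that this only shows the flag set by blockdev --setro; it does not show the permissions of the device node changed by chmod.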

disclaimer

You should still know what you are doing when you use confinedrv. Do not mount the same partition more than once at the same time unless every one of these mounts is read-only. If a partition is written to while it is mounted read-only on another system, that system will read stale data because it does not know about the changes made to the underlying partition. Note that sdx and sda are still the same drive; only the access rights differ. Note also that the date and time of the last mount will be written to a non-isofs partition even on a mount with the read-only option (-o ro). Confine the access rights of such a partition if you want to keep this information from being written to stable storage.

Qemu - virtual machines with kvm

Qemu is the new integrated virtualization technology of Linux. While plain qemu is a hardware emulator that has to simulate every machine instruction in software, qemu-kvm can execute native machine instructions directly. This works as long as host and guest OS need the same hardware, i.e. from x86_64 to x86_64 or i686. The only exception to direct machine code execution are the so-called privileged instructions (protected mode instructions of privilege level 0). If a user program tries to execute such an instruction, an exception is raised. At this point virtualization technologies like VMware emulate the requested privileged instruction. Modern processors with the VT-x (Intel) or SVM (AMD) extension, as provided by dual core machines, have a special hardware mechanism for this that not only speeds up the execution of privileged instructions but also eases the implementation of virtual machines considerably. Relying on this hardware mechanism a new, very slim and straightforward virtualization technology has emerged: kvm. Currently qemu-kvm is the only frontend to the kvm kernel module. To make use of kvm, check whether the kvm and kvm_intel kernel modules are loaded (kvm_amd on AMD processors):

> lsmod | grep kvm
kvm_intel 47746 0
kvm 307054 1 kvm_intel

If these kernel modules are present in /lib/modules/yourkernel (yourkernel: uname -r) but refuse to load (modprobe kvm, modprobe kvm_intel), you may have a problem with the hardware support. Check whether your processor supports VT-x (Intel) or SVM (AMD) and whether these technologies are enabled in the BIOS. A look at dmesg or /var/log/messages may tell you why these modules do not load.
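
Whether the processor advertises one of these extensions can be read from the flags line in /proc/cpuinfo, where VT-x shows up as vmx and SVM as svm. The following is a minimal sketch; the function name cpu_virt_support is made up for illustration, and a missing flag may also mean that the extension is merely disabled in the BIOS.

```shell
#!/bin/sh
# cpu_virt_support [CPUINFO]: print "vmx" (Intel VT-x), "svm" (AMD SVM) or
# "none", depending on the CPU flags listed in CPUINFO (default /proc/cpuinfo).
cpu_virt_support() {
  cpuinfo=${1:-/proc/cpuinfo}
  if grep '^flags' "$cpuinfo" 2>/dev/null | grep -qw vmx; then
    echo vmx
  elif grep '^flags' "$cpuinfo" 2>/dev/null | grep -qw svm; then
    echo svm
  else
    echo none
  fi
}

cpu_virt_support
```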

kvm has also been ported to FreeBSD and OpenBSD for AMD processors. Look here for the current program status.

For a full qemu-kvm experience you may additionally want to install and load (modprobe) the kqemu kernel module:

> modprobe kqemu

Under Debian a simple /etc/init.d/qemu-kvm start will do the job and load all required kernel modules so that you can use qemu-kvm.

booting a virtual machine from a confined disk

Now let us enjoy confinedrv by booting one of the alternative operating systems on our hard disk from within our current Linux installation or any other qemu-kvm capable operating system. Only confinedrv can guarantee here that the second operating system will not access any of the partitions in use by the operating system we have already booted. In general it is a good idea to let a writeable partition be accessed by a single operating system only. That way you can safely boot a wrecked Linux installation via qemu with quite high confidence that possible malware on that system cannot affect any other system on your computer. KQemu is directly maintained by kernel developers and offers good stability and security. Nonetheless there have been known bugs that allow a sandbox escape for kqemu, so check the security mailing list of your distribution first.

Now let us fire up a console:

> confinedrv --ra sdx=sda
dmsetup create sdx </tmp/tmp.NoHga2W7mu
> confinedrv --ra sdy=sdb9
dmsetup create sdy </tmp/tmp.yiVyjv01zv
> qemu-kvm -m 1G -hda /dev/mapper/sdx -hdb /dev/mapper/sdy -cdrom /dev/sr0 -boot d

virtually replacing and verifying your bootloader

If you would like to have a different bootloader when booting from qemu, you may use the mbr=./myfile.mbr command line argument to blend in a file in place of the current Master Boot Record. The file needs to have the same size as the spare space before the first partition. Be aware that the space backed up by readout-mbr also contains at least parts of the partition table. When the bootloader is used with qemu only, we recommend a BIOS based bootloader like lilo rather than a UEFI based one (grub can also be run in BIOS mode). You may also replace existing partitions by previous backups of these partitions, provided that the partition and its image file match in size.

> confinedrv readout-mbr --from sda --into /media/sda-qemu.mbr
> lilo -C mylilo.conf -M /media/sda-qemu.mbr
> ddrescue /dev/sda5 /media/sda5.part
or: dd if=/dev/sda5 bs=8192 of=/media/sda5.part
> confinedrv --mbr rw sdy=sda:r3:w5,6 mbr=/media/sda-qemu.mbr sdy5=/media/sda5.part

remark: for good performance the block size bs may be set to the maximum physical block size of the source and target drive; this should work even if the file size is not a multiple of the block size.

You can also blank, reinstall and verify the bootloader against a previously taken image or against zeroes. These commands only affect the first 440 bytes, i.e. the real master boot record but not the partition table. Blanking your bootloader may be useful for a hard disk you do not want to boot from by accident. It ensures that no boot virus or malware can reside in the bootloader of such a disk.

> confinedrv reinstall-mbr --from myfile.mbr --into /dev/sda ... copy first 440 bytes from file to disk device
> confinedrv compare-mbr --from myfile.mbr --with [/dev/]sda ... compare myfile.mbr with the beginning of /dev/sda
> confinedrv blank-mbr --into /dev/sda ... overwrite first 440 bytes with zeroes
> confinedrv test-mbr-blanked --from [/dev/]sda ... check if first 440 bytes are zero
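
The 440-byte boundary can be tried out on an ordinary image file with standard tools. The following is only a sketch using cmp and dd, not the way confinedrv implements these subcommands; the function names mbr_is_blank and mbr_matches are made up for illustration.

```shell
#!/bin/sh
MBR_CODE_BYTES=440   # boot code area only; the partition table lies behind it

# mbr_is_blank IMAGE: succeed if the first 440 bytes of IMAGE are all zero.
mbr_is_blank() {
  cmp -s -n "$MBR_CODE_BYTES" "$1" /dev/zero
}

# mbr_matches A B: succeed if A and B start with the same 440 bytes.
mbr_matches() {
  cmp -s -n "$MBR_CODE_BYTES" "$1" "$2"
}

# Demonstration on a scratch file instead of a real disk.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
if mbr_is_blank "$img"; then echo "boot code area is blank"; fi
rm -f "$img"
```

Both functions only read their arguments, so they can also be pointed at a saved image such as myfile.mbr.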

partition table images

confinedrv also allows you to back up the whole partition table; i.e. you do not need to copy the whole disk but only the partition table and, as needed, individual partitions.

> confinedrv readout-parttbl --from /dev/sda --into my.xx
reading out partition table and spare spaces done
> confinedrv sdx=1 parttbl=my.xx sdx1=my-sda1.part
# You may also write sdx=my1 or sdx= if all partitions shall remain zeroed

sparse files: backing up raw qemu partition images

First let us create a sparse file of 4 GB which turns out to occupy only 12 KB on disk. For sparse files, unused zeroed blocks are not actually stored on disk. You may use such a file as a disk image with the -hda mydisk.img switch of qemu. Copy such files with cp --sparse=always.

> dd if=/dev/zero bs=4096 seek=$((4*1024*1024*1024/4096-1)) count=1 of=mydisk.img
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0828963 s, 49.4 kB/s
> ls -l mydisk.img
-rw-r--r-- 1 root root 4294967296 Sep 23 16:48 mydisk.img
> du -sk mydisk.img
12 mydisk.img
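
As an alternative sketch, truncate from GNU coreutils creates a sparse file without writing any block at all, and stat and du show the difference between apparent and allocated size. The function names are made up for illustration, and how little space is really allocated depends on the file system in use.

```shell
#!/bin/sh
# make_sparse FILE SIZE: give FILE an apparent size of SIZE without writing data.
make_sparse() {
  truncate -s "$2" "$1"
}

# apparent_bytes FILE: file length in bytes as shown by ls -l.
apparent_bytes() { stat -c %s "$1"; }

# allocated_kib FILE: space actually occupied on disk in KiB, as shown by du -k.
allocated_kib() { du -k "$1" | cut -f1; }

img=$(mktemp)
make_sparse "$img" 4G
echo "apparent size: $(apparent_bytes "$img") bytes"
echo "allocated:     $(allocated_kib "$img") KiB"
rm -f "$img"
```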

However, when it comes to backing up a sparse file to Blu-ray, or if you want to copy it onto an ntfs or vfat partition, you would need the whole 4 GB on the target device if you simply copied the file. For this use case we have written the sparsefile-rescue and sparsefile-restore utilities:

> sparsefile-rescue --sync if=mydisk.img bs=4096 oi=compacted.simg
# --sync synchronizes to disk before quitting and thereby improves error reporting
> sparsefile-restore ii=compacted.simg of=my-restored.img

other programs in drvtools (formerly diskutils): binary diff and replace, sha256sum lists & shredding the disk content

The programs bindiff and binpatch allow comparing two binary files block by block. Blocks that differ may be backed up to restore the target file from the source file later on. bindiff is also useful as a replacement for cmp, as it does not only show the first byte that differs but all blocks, or the total number of blocks, that differ. The binreplace program can replace a portion of a binary file, like the bootloader in a disk image. If you tried to use plain dd instead (without conv=notrunc), it would truncate the file, i.e. delete all content after the portion that has been replaced.
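
For comparison, this is how an in-place replacement looks with dd itself: by default dd truncates the output file after the last byte written, while conv=notrunc keeps the rest of the file. This is a sketch with plain dd only and says nothing about the interface of binreplace; the function name replace_at is made up for illustration.

```shell
#!/bin/sh
# replace_at TARGET OFFSET PATCH: overwrite the bytes of TARGET starting at
# byte OFFSET with the contents of PATCH, keeping everything behind them.
replace_at() {
  dd if="$3" of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

target=$(mktemp)
patch=$(mktemp)
printf 'AAAAAAAAAA' > "$target"
printf 'xyz' > "$patch"
replace_at "$target" 2 "$patch"
cat "$target"; echo    # prints AAxyzAAAAA
rm -f "$target" "$patch"
```

bs=1 keeps the byte offsets simple but is slow for large replacements.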

drvtools includes some other utilities, for example for generating sha256sum lists and for verifying an operating system installation against a previously generated sha256sum list. Furthermore some bash scripts for wiping the disk content based on used disk blocks are included. However, these scripts have only been used once, and you will need to read them before applying them because they are neither documented nor extensively tested. They first overwrite all disk blocks once before attempting to overwrite them a second time. One or a few rewrites should be sufficient if an attacker would only access the content electronically. Otherwise you would need a whole lot of repetitions like the shred Linux utility does. drvtools also has a script to generate random character combinations of the length of a file or lvm volume group name, in case you want to delete with shred and thereby also obfuscate the name.
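
The idea behind such checksum lists can be sketched with sha256sum from GNU coreutils alone. This is not the drvtools utility itself, and the function names are made up for illustration. LIST has to be given as an absolute path outside of DIR.

```shell
#!/bin/sh
# make_sumlist DIR LIST: write SHA-256 checksums of all regular files below DIR
# into LIST, using paths relative to DIR.
make_sumlist() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_sumlist DIR LIST: succeed only if every file listed in LIST still
# exists below DIR with the same checksum; mismatches are printed.
verify_sumlist() {
  ( cd "$1" && sha256sum --quiet -c "$2" )
}

# Demonstration on a scratch directory.
dir=$(mktemp -d)
list=$(mktemp)
echo hello > "$dir/a"
make_sumlist "$dir" "$list"
if verify_sumlist "$dir" "$list"; then echo "all files unchanged"; fi
rm -rf "$dir" "$list"
```

Files that were added after the list was generated are not detected this way; only listed files are checked.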

enjoy.



here it is:

Download:
drvtools v1.7.11   (supports drives with numbers in the name like /dev/nvme0n1)
drvtools v1.7.10   (important bugfix, --skipend feature needs to be given explicitly)
drvtools v1.7.9   (bugfix, --skipend feature needs to be given explicitly)
drvtools v1.7.8   (includes confinedrv v1.7.8)
confinedrv v1.7.7 + man page   (new: mbr image file support)
confinedrv v1.7.6 + man page   (new: partition image files supported)
confinedrv v1.7.2 + man page v1.7   (new: GPT support)
confinedrv v1.7.1 + man page v1.7
confinedrv v1.2.1 + man page
confinedrv v1.1
confinedrv v1.0
*** new *** covered by our gpg-signed software/SHA512SUMS.signed.
Author:
Elws. Stellnberger elws@elstel.org

Please sign our Contributor License Agreement if you want to contribute code. Otherwise we can not assimilate and re-distribute your changes here at elstel.org

note: chmod +x confinedrv may be required


improvements by version 1.7: known issues