Subject: 1.700: Free LVM lecture slides.
From: shieh@austin.ibm.com

If you want free LVM documentation (lecture notes) from the SHARE conference in San Francisco where I presented last March just:

    mail -s "S_basics.ps" shieh@austin.ibm.com < /dev/null
    mail -s "S_limits.ps" shieh@austin.ibm.com < /dev/null
    mail -s "S_lvm_extra.ps" shieh@austin.ibm.com < /dev/null

[Editor's note: Jens-Uwe Mager converted the slides to PDF format. They are available as .]

------------------------------

Subject: 1.701: How do I shrink /usr?
From: mike@bria.UUCP (Michael Stefanik) and Richard Hasting

FOR AIX 3.1
-----------

1) Make a backup of /usr
       find /usr -print | backup -ivf /dev/rmt0   (or appropriate device)

2) Shut down to maintenance mode
       shutdown -Fm

3) export LANG=C

4) Remove the filesystem and the logical volume (ignore an error about the "dspmsg" command not found)
       umount /usr
       rmfs /usr

5) Make a new logical volume hd2 and place it on rootvg with the desired size
       mklv -yhd2 -a'e' rootvg NNN
   where NNN is the number of 4 meg partitions

6) Create a filesystem on /dev/hd2
       crfs -vjfs -dhd2 -m'/usr' -Ayes -p'rw'

7) Mount the new /usr filesystem and check it
       /etc/mount /usr
       df -v

8) Restore from the tape; the system won't reboot otherwise
       restore -xvf/dev/rmt0

9) Sync and reboot the system; you now have a smaller /usr filesystem

FOR AIX 3.2
-----------

0) Experiences posted to comp.unix.aix lead me to suggest that many administrators find the following piece of information useful after completing this procedure. I thought some of you might like to read it BEFORE getting yourself into this predicament. Call 1-800-IBM-4FAX and request document 2503 dated 1/26/94. Title is "How to recover if all files are owned by root after restoration from a mksysb tape".

1) Remove any unneeded files from /usr.

2) Make sure all filesystems in the root volume group are mounted. If not, they will not be included in the re-installed system.

3) Type mkszfile.
   This will create /.fs.size, which contains a list of the active filesystems in the root volume group that will be included in the installation procedure.

4) Edit .fs.size. Change the size of /usr to what you want.

   Example: This .fs.size file shows /usr to be 40MB.

       rootvg 4 hd2 /usr 10 40 jfs

   The 10 is the number of physical partitions for the filesystem and the 40 is 40 MB. Most systems have a physical partition size of 4 MB. Therefore, the second number (40) will always be 4 times the previous number (10). Note, however, that a model 320 with a 120 MB drive will have a physical partition size of only 2 MB, and the total MB is twice the number of physical partitions. The first number (4) in the .fs.size file represents the PP size.

   If you want to reduce the size of /usr from 40 MB to 32 MB, edit the /usr entry to:

       rootvg 4 hd2 /usr 8 32 jfs

   IMPORTANT: Make sure that you DO NOT enter a value which is less than the size of the filesystem required to contain the current data. Doing so will cause the re-installation procedure to fail.

5) chdev -l rmt0 -a block=512 -T

6) Unmount all filesystems that are NOT in the root volume group.

7) Vary off all user-defined volume groups, if any
       varyoffvg VGname

8) Export the user-defined volume groups, if any
       exportvg VGname

9) With a tape in the tape drive, type
       mksysb /dev/rmt0
   This will do a complete system backup, which will include information (in the .fs.size file) telling the installation procedure how large the filesystems are to be created.

10) Follow the instructions in the Installation Kit under "How to Install and perform maintenance from Diskettes" (reportedly now called "BOS Installation from a System Backup") using the diskettes and tape that you created in the previous steps.
    [ pre AIX 325: DO NOT select the option "Reinstall AIX with Current System Settings". Instead use "Install AIX with Current System Settings" for the logical volume size changes to take effect.
    ]
    [ w/ AIX 325: Select "Install from a mksysb image" ]

11) When the installation is complete, you may then import any user-defined volume groups.
       importvg -y VGname PVname
    where "VGname" is the name of the volume group, and "PVname" is the name of any one of the physical volumes in the volume group.

12) Vary on your user-defined volume groups
       varyonvg VGname

The reduction of the filesystems is now complete.

COMMERCIAL OPTION
-----------------

There are also commercial tools available to help you do this more conveniently. I know of one vendor that can be reached at info@compunix.com

------------------------------

Subject: 1.702: How do I make a filesystem larger than 2Gb?

AIX 3.2.5 and preceding versions are limited to 2 Gigabytes per filesystem. With AIX 4.1 IBM allows filesystems up to 64Gb (reference: ). Individual files are still limited to 2Gb. AIX 4.2 allows 128Gb filesystems and 64 Gb files. (See also question 1.706.)

If you are having trouble creating a file greater than 1Mb it may be because that is the default limit for your account; see 'smit users' or /etc/security/limits.

------------------------------

Subject: 1.703: Chlv warning. Is the first 4k of a LV safe?

The first 4k of a raw LV are used to store the control block. Applications that write to the raw disk can overwrite this section (common applications that do this are Oracle and Sybase). Commands that call getlvcb will generate a warning but succeed (since the control block exists in the ODM). Don't run synclvodm unless you really want to erase the first 4k and replace it with the info from the ODM.

shieh@austin.ibm.com (Johnny Shieh) has kindly provided the following explanation:

The logical volume control block (lvcb) is the first 512 bytes of a logical volume. This area holds important information such as the creation date of the logical volume, information about mirrored copies, and possible mount points in a journaled filesystem.
Certain LVM commands are required to update the lvcb, as part of completeness algorithms in LVM. The old lvcb area is first read and analyzed to see if it is a valid lvcb. If the information is verified as valid lvcb information, then the lvcb is updated. If the information is not valid, then the lvcb update is not performed and the user is given the warning message:

    Warning, cannot write lv control block data

Most of the time, this is a result of database programs accessing the raw logical volumes (and thus bypassing the journaled filesystem) as storage media. When this occurs, the information for the database is literally written over the lvcb. Although this may seem fatal, it is not the case. Once the lvcb has been overwritten, the user can still:

1) Extend a logical volume
2) Create mirrored copies of a logical volume
3) Remove the logical volume
4) Create a journaled filesystem with which to mount the logical volume (note that this will destroy any data sitting in the lvcb area)

However, there is a limitation caused by this deletion of the lvcb. The logical volumes with deleted lvcb's face possible, incomplete importation into other AIX systems. During an "importvg", the LVM command will scan the lvcb's of all defined logical volumes in a volume group for information concerning the logical volumes. Surprisingly, if the lvcb is deleted, the imported volume group will still define the logical volume to the new AIX system which is accessing this volume group, and the user can still access the raw logical volume. However, any journaled filesystem information is lost and the logical volume and its associated mount point won't be imported into the new AIX system. The user must create new mount points and the availability of previous data stored in the filesystem is NOT assured.

Also, during this import of a logical volume with an erased lvcb, some non-jfs information concerning the logical volume, which is displayed with the "lslv" command, cannot be found.
When this occurs, the system uses default logical volume information to populate the logical volume's ODM information. Thus, some output from the "lslv" will be inconsistent with the real logical volume. If logical volume copies still exist on the original disks, this information will not be correctly reflected in the ODM database. The user should use "rmlvcopy" and "mklvcopy" to rebuild any logical volume copies and synchronize the ODM.

Finally, with an erased lvcb, the output from the "lslv" command might be misleading or unreliable.

------------------------------

Subject: 1.704: What's the limit on Physical Partitions Per Volume Group?
From: shieh@austin.ibm.com (Johnny Shieh)

1016 Physical Partitions Per Disk in a Volume Group:

In the design of LVM, each Logical Partition maps to one Physical Partition. And, each Physical Partition maps to a number of disk sectors. The design of LVM limits the number of Physical Partitions that LVM can track PER DISK in a volume group to 1016. In most cases, not all the possible 1016 tracking partitions are used by a disk. The default size of each Physical Partition during a "mkvg" command is 4 MB, which implies that individual disks up to 4 GB can be included into a volume group.

If a disk larger than 4 GB is added to a volume group (based on usage of the default 4 MB size for Physical Partition) the disk addition will fail with a warning message that the Physical Partition size needs to be increased.*

There are two instances where this limitation will be enforced. The first case is when the user tries to use "mkvg" to create a volume group where the number of physical partitions on one of the disks in the volume group would exceed 1016. In this case, the user must pick from the available Physical Partition ranges of:

    1, 2, (4), 8, 16, 32, 64, 128, 256 Megabytes

and use the "-s" option to "mkvg".
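As a rough illustration of the first case, the smallest workable Physical Partition size for a given disk can be estimated with a few lines of shell. This is only a sketch, not an LVM tool: the function name min_pp_size is made up for this example, the disk capacity is assumed to be given in whole megabytes, and any space LVM reserves on the disk for its own use is ignored.

```shell
#!/bin/sh
# Sketch: pick the smallest Physical Partition size (in MB) from the list
# above that keeps a disk within the 1016-partition limit.
# min_pp_size is a hypothetical helper name, not an AIX command.
min_pp_size() {
    disk_mb=$1
    for pp in 1 2 4 8 16 32 64 128 256; do
        # Whole partitions that fit on the disk at this partition size.
        if [ $(( disk_mb / pp )) -le 1016 ]; then
            echo "$pp"
            return 0
        fi
    done
    echo "disk too large for a 256 MB partition size" >&2
    return 1
}

# Example: a disk of roughly 9 GB (9100 MB is an assumed capacity).
min_pp_size 9100
```

The printed value is the number that would be given to the "-s" option of "mkvg" for that disk.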
The second case is where the disk which violates the 1016 limitation is attempting to join a pre-existing volume group with the "extendvg" command. The user can either recreate the volume group with a larger Physical Partition size (which will allow the new disk to work with the 1016 limitation) or the user can create a standalone volume group (consisting of a larger Physical Partition size) for the new disk.

In AIX 4.1 and 3.2.5, if the install code detects that the rootvg drive is larger than 4 GB, it will change the "mkvg -s" value until the entire disk capacity can be mapped to the available 1016 tracks.** This install change also implies that all other disks added to rootvg, regardless of size, will also be defined at that new Physical Partition size.

For RAID systems, the /dev/hdiskX name used by LVM in AIX may really consist of many non-4GB disks. In this case, the 1016 limitation still exists. LVM is unaware of the size of the individual disks that may really make up /dev/hdiskX. LVM bases the 1016 limitation on the AIX recognized size of /dev/hdiskX, and not the real independent physical disks that make up /dev/hdiskX.

The questions asked about this issue are:

1) What are the symptoms of this problem?
2) How safe is my data? What if I never use mirroring or migratepv?
3) Can I move this volume group between RS/6000 systems and versions of AIX?

Here are the answers:

A) What are the symptoms of this problem?

The 1016 VGSA is used to track the "staleness of mirrors". If you are in violation of 1016, you may possibly get a false report of a non-mirrored logical volume being "stale" (which is an oxymoron) or you may get a false indication that one of your mirror copies has gone stale. Next, migratepv may fail because migratepv briefly uses mirroring to move a logical volume from one disk to another.
If the target logical partition is incorrectly considered "stale", then the migratepv cannot remove the source logical partition and the migratepv command will fail in the middle of migration.

B) How safe is my data? What if I never use mirroring or migratepv?

The data is as safe (in your mind) as the day before you found out about 1016 violations. The only case where data may be lost is if one is mirroring a logical volume and ALL copies go bad at the same time and LVM isn't aware of it because the copies that go bad are beyond the 1016 tracking range. However, in this case, you would lose data even if you were within the 1016 range. If you never mirror or use migratepv, then this issue shouldn't concern you. But, it might be unwise to state you'll NEVER use either of those options.

C) Can I move this volume group between RS/6000 systems and versions of AIX?

Yes you can. The enforcement of this 1016 limit is only during mkvg and extendvg. The "safeness" of the data on the volume group on AIX 3.2 is the same as it is on AIX 4.1.

* This bug was fixed in apar ix48926. Current AIX 3.2.5 and 4.1.1 systems that do not have this fix applied will allow the creation of volume groups with more than 1016 partitions. The implication of this bug allowing more than 1016 physical partitions is that the user may access all portions of the logical volume. However during disk mirroring, the status of partitions beyond the 1016 limit will not be tracked correctly. If mirrors beyond the 1016 range become "stale", LVM will not be aware of their condition and data consistency may become an issue for those partitions. Additionally, the "migratepv" command creates mirrors and deletes them as a method for moving logical volumes around within/between disks. If the 1016 limit is violated, then the "migratepv" command may not behave correctly. The user should pick up apar ix51754, which clarifies the error message when this condition is detected.
Additionally, the user can read the non-ptf documentation apar ix50874 which is a companion to ix48926 and ix51754.

** This bug was fixed for AIX 3.2.5 rootvg install in apars ix46862 and ix46863. This bug does not exist in AIX 4.1.1.

------------------------------

Subject: 1.705: Why am I having trouble adding another disk to my VG?
From: shieh@austin.ibm.com (Johnny Shieh)

In some instances, the user will experience a problem adding a new disk to an existing volume group or in the creation of a new volume group. The warning message provided by LVM will be:

    Not enough descriptor space left in this volume group.
    Either try adding a smaller PV or use another volume group.

On every disk in a volume group, there exists an area called the Volume Group Descriptor Area (VGDA). This space is what allows the user to take a volume group to another AIX system and "importvg" that volume group into that AIX system. The VGDA contains the names of disks that make up the volume group, their physical sizes, partition mapping, logical volumes that exist in the volume group, and other pertinent LVM management information.

When the user creates a volume group, the "mkvg" command defaults to allowing the new volume group to have a maximum of 32 disks in a volume group. However, as bigger disks have become more prevalent, this 32 disk limit is usually not achieved because the space in the VGDA is used up faster, as it accounts for the capacity on the bigger disks. This maximum VGDA space, for 32 disks, is a fixed size which is part of the LVM design. Large disks require more management mapping space in the VGDA, which causes the number and size of available disks to be added to the existing volume group to shrink. When a disk is added to a volume group, not only does the new disk get a copy of the updated VGDA, but all existing drives in the volume group must be able to accept the new, updated VGDA.

The exception to this description of the maximum VGDA is rootvg.
In order to provide AIX users more free space, when rootvg is created, "mkvg" does not use the maximum limit of 32 disks that are allowed into a volume group. Instead in AIX 3.2, the number of disks picked in the install menu of AIX is used as the reference number by "mkvg -d" during the creation of rootvg. For AIX 4.1, this "-d" number is 7 for one disk and one more for each additional disk picked; i.e. if you pick two disks, the number is 8, if you pick three disks, the number is 9, and so on.

This limit does not mean the user cannot add more disks to rootvg in the post-install phase. The amount of free space left in a VGDA, and thus the number and size of the disks added to a volume group, depends on the size and number of disks already defined for a volume group. However, this smaller size during rootvg creation implies that the user will be able to add fewer disks to rootvg than to a non-rootvg volume group.

If the customer requires more VGDA space in the rootvg, then they should use the "mksysb" and "migratepv" commands to reconstruct and reorganize their rootvg (the only way to change the "-d" limitation is recreation of the rootvg).

Note: It is always strongly recommended that users do not place user data onto rootvg disks. This separation provides an extra degree of system integrity.

------------------------------

Subject: 1.706: What are the limits on a file, filesystem?

There are other limits but these come up most often. Logical Volumes do not _have_ to contain Journaled File Systems and therefore can be larger than 2GB even in 3.2.5.

             File     jfs-Filesystem
    3.2.5    2GB      2GB
    4.1.x    2GB      64GB
    4.2      64GB     128GB

While it *might* be possible to create larger file systems, the limits shown here represent values that IBM has supposedly tested.

------------------------------

Subject: 1.707: Hints for Seagate 9 GB and other disks larger than 4 GB?

[read 1.704]

------------------------------

Subject: 1.708: How do I fix Volume Group Locked?
From /usr/lpp/bos/README (AIX 3.2.5) and 1.800.IBM.4FAX #2809

If you get '0516-266 publvodm: volume group rootvg is locked, try again' or something similar, you can use

    (putlvodm -K `getlvodm -v `)

------------------------------

Subject: 1.709: How do I remove a volume group with no disks?
From: shieh@austin.ibm.com (Johnny Shieh)

This is a very common question about AIX LVM and I thought I might take some time to explain what is going on.

Within a volume group is the Volume Group Descriptor Area (VGDA), which is kinda a "suitcase" of LVM information. This is what allows you to pick up your drives and take them to another machine, importvg them, and get filesystems automatically defined. What happens is that when you importvg the volume group, the RS/6000 goes out and reads the VGDA and finds out about all the logical volumes and filesystems that may exist on the volume group. It then checks for clashes (name conflicts, etc.) on its own machine and then, here is the important part, populates its own database with information about the new volume group and its associated logical volumes. In cases of filesystems, it will go into the /etc/filesystems file and add the new filesystem entries that came along with the imported volume group.

Okay, the key point is that you've got this independent volume group that has "docked" at the new RS/6000. What keeps the two tethered to each other is the varyonvg command. When this is started on the volume group, a software link is created where you can't separate the volume group from the AIX operating system unless the volume group is no longer seen as active by the system. In very rare cases, a situation can occur where the VGDA thinks that someone has it (the volume group) activated, but the operating system doesn't think it has the volume group opened up. This is pretty rare.

The main question I see is "I've taken away the disks, but how do I get rid of the volume group".
The question should really say, "How do I get rid of the volume group INFORMATION" since that's all you have on the system. You've got possible entries in the /etc/filesystems and definitely entries in the ODM. Just do:

    exportvg

It does a reverse importvg, except it doesn't go off and read the VGDA. It nukes anything relating to the volume group in the /etc/filesystems and ODM. The only time this won't work is if the system detects that the volume group is varied on. Then, it would be like trying to change tires on a moving car; we won't let you do it!

Some people are concerned that doing an exportvg will somehow damage the volume group and/or its VGDA. As I said, all it does is affect the information about the volume group on the RS/6000 box, not on the actual disk platter itself. Thus, the volume group you exported is safe to take to another system. The only time the VGDA gets overwritten is when you create a new volume on top of it.

The second most often asked question is "How do I get rid of a disk that is no longer really in the volume group?" In this case, you DON'T want to do an exportvg. What you want to do is tell the system you want to cut out the memory of the old, bad disk from the RS/6000 AND from the VGDA of the volume group. You simply do:

    reducevg -d -f

or if the hdname can't be found:

    reducevg -d -f

Be careful with this command. Unlike the exportvg command, actions done with this command WILL affect the VGDA information on the platter.

Hope this clarifies some questions about volume groups.

------------------------------

Subject: 1.710: What are the theoretical limits within the LVM?
From: Gerry FitzGerald

-------------------------------------
LVM Limits within AIX (my perception)
-------------------------------------

The system may have 1 to 255 Volume Groups (VG's).
Each VG may contain 1 to 32 Physical Volumes (PV's).
Each PV may contain up to 1016 Physical Partitions (PP's).
Each PP may have a size (a power of 2) from 1 to 256MB (1024MB for AIX 4.3).

Therefore, if you can get hold of a 260,096 MB disk (one PV with 1016 x 256MB PPs), you can install 32 of these in a single VG giving you 8,323,072MB per VG. You may have up to 255 VG's in one AIX system so you could (in theory) create the maximum addressable AIX storage area of 2,122,383,360 MB (2,072,640 GB or 2,024 TB or approx. 2 PB). This is based on the current limitations of AIX V4.1.

The limits for file and filesystem sizes are:

[Editor's note: the original values in this mail appeared to be slightly wrong, I have corrected that to the values as per my interpretation of the AIX manual.]

AIX V3.2
    Max filesystem size:  2,147,483,647 bytes (2 GB)
    Max single file size: 2,147,483,647 bytes (2 GB)

AIX V4.1
    Max filesystem size:  1,099,511,627,776 bytes (1 TB)
    Max single file size: 2,147,483,647 bytes (2 GB)

AIX V4.2
    Max filesystem size:  1,099,511,627,776 bytes (1 TB)
    Max single file size: 68,589,453,312 bytes (~64 GB)

AIX V4.3
    Max filesystem size:  1,099,511,627,776 bytes (1 TB)
    Max single file size: 68,589,453,312 bytes (~64 GB)

The 1TB maximum file system size is given by the rule that each fragment must be addressable by a 28-bit number, with the largest fragment size being 4096 bytes (4096*2^28).

------------------------------
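The capacity arithmetic in 1.710 can be checked with shell arithmetic. This is only a sketch of the multiplication described in the text, using the AIX V4.1 limits quoted there; the variable names are made up for this example, and the last line assumes a shell whose arithmetic is at least 64 bits wide.

```shell
#!/bin/sh
# Limits quoted in 1.710 (AIX V4.1); variable names are illustrative only.
pps_per_pv=1016       # Physical Partitions per Physical Volume
max_pp_mb=256         # largest Physical Partition size, in MB
pvs_per_vg=32         # Physical Volumes per Volume Group
vgs_per_system=255    # Volume Groups per system

pv_mb=$(( pps_per_pv * max_pp_mb ))        # largest single disk
vg_mb=$(( pv_mb * pvs_per_vg ))            # largest volume group
system_mb=$(( vg_mb * vgs_per_system ))    # all volume groups together

echo "per PV:     $pv_mb MB"
echo "per VG:     $vg_mb MB"
echo "per system: $system_mb MB ($(( system_mb / 1024 )) GB)"

# Filesystem limit: 2^28 addressable fragments of at most 4096 bytes each.
max_fs_bytes=$(( 4096 * (1 << 28) ))
echo "max filesystem: $max_fs_bytes bytes"
```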