SAP Sybase IQ: Do not use LF indexes in IQ v16

Really, there isn’t a good reason to use LF indexes in version 16 and higher of IQ. Use HG indexes instead. From Mark Mumy of SAP fame:

Mumy, Mark mark.mumy at sap.com
Mon Aug 17 11:31:37 MST 2015

There is no need with IQ 16 SP08.20 and beyond to use an LF. The HG index has been augmented over the past few years to take on the same characteristics for low cardinality data while adding parallelism and other join/sort features that just don’t exist in the LF.

Remove all LF indexes and replace them with HGs, if they don’t have one. There is no need for an HNG as the LF (don’t use) and HG have many of those features and can cost less.

Mark

Mark Mumy
Director, Enterprise Architecture, Global HANA CoE | SAP
M +1 347-820-2136 | E mark.mumy at sap.com
My Blogs: http://scn.sap.com/people/markmumy/blog
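
In practice, acting on this advice is one statement per index. A minimal sketch, assuming a hypothetical fact table sales_fact with an old LF index named sales_region_lf on a low-cardinality column:

drop index sales_fact.sales_region_lf;
create hg index sales_region_hg on sales_fact (region_id);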


ERROR: SQLite database is malformed – SOLVED

Every once in a while, if you’re making heavy use of a SQLite database, it will become corrupted, just as with any other DBMS. Usually this is due to bugs within the DBMS itself, and sometimes to hardware failure.

I had a SQLite database go corrupt when an application crashed. Normally I would have a backup to go to but in this case I didn’t because it was a new application that I was in the process of setting up.

Attempting to list tables in the database using sqlite3 resulted in the following message:

Error: near line 1: database disk image is malformed

In SQLite, the consistency checker is pragma integrity_check.

This pragma does an integrity check of the entire database. The integrity_check pragma looks for out-of-order records, missing pages, malformed records, missing index entries, and UNIQUE and NOT NULL constraint errors. If the integrity_check pragma finds problems, strings are returned (as multiple rows with a single column per row) which describe the problems. Pragma integrity_check will return at most N errors before the analysis quits, with N defaulting to 100. If pragma integrity_check finds no errors, a single row with the value ‘ok’ is returned.

PRAGMA integrity_check does not find FOREIGN KEY errors. Use the PRAGMA foreign_key_check command to find errors in FOREIGN KEY constraints.
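
For reference, both pragmas are run from the sqlite3 shell, and integrity_check optionally takes a cap on the number of errors it reports:

sqlite> PRAGMA integrity_check(10);   -- report at most 10 errors, then stop
sqlite> PRAGMA foreign_key_check;     -- FOREIGN KEY violations, checked separately

Against my corrupt database, the basic check simply returned: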

sqlite> pragma integrity_check;
Error: database disk image is malformed

Yeah, it’s effectively a boolean response: “ok” if all is good, otherwise “database disk image is malformed”. Not very helpful, IMO.

So what to do? Obviously, reverting to an earlier backup is preferred but if you remember, I didn’t have one and I didn’t want to redo all that work.

We need to export the schema and data, create a new database, and import them into it. Simple:

sqlite> .mode insert
sqlite> .output mydb_export.sql
sqlite> .dump
sqlite> .exit

The .output command redirects the output of subsequent commands to the file mydb_export.sql.
The .dump will export the database in SQL format.
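
If you prefer, the same export can be done non-interactively in one shot from the OS shell; the dot command is passed as an argument:

sqlite3 mydb.db .dump > mydb_export.sql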

Now, move the mydb.db file to mydb.db.original.

mv mydb.db mydb.db.original

We are almost done. Let’s create the database and load the exported schema & data:

sqlite3 mydb.db < mydb_export.sql
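
Before trusting the rebuilt file, it’s worth re-running the consistency check against it; this time it should come back with a single “ok”:

sqlite3 mydb.db "PRAGMA integrity_check;"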

One more step: if the database has indexes (and all databases should), you will want to update the index statistics:

sqlite> analyze;
sqlite> .exit

Confused on which SAP Sybase IQ index to use and why?

SAP Sybase IQ Indexes and Indexing Techniques

by Mark Mumy

Are you new to SAP Sybase IQ? Have you used the product but are still not clear on what indexes to use or when to use them? Have you just migrated to SAP Sybase IQ 16 and are wondering if there are any indexing changes? This blog will discuss indexes and strategies for SAP Sybase IQ.

Basic Indexes
When creating a table in IQ, a default index is placed on every column. This index is referred to as a Fast Projection (FP) index. In order to enable optimized FP indexes, one must set the Minimize_Storage option to ON.
set option public.minimize_storage='on';
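
To see the effect, here is a minimal sketch with a hypothetical table name; if I have the procedure right, sp_iqindex is the stock IQ procedure for listing a table’s indexes, including the per-column FP indexes created by default:

create table sales_fact (
    region_id int,
    amount    numeric(12,2)
);
sp_iqindex 'sales_fact';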

Read the entire post on Mark Mumy’s blog


Sybase ASE: Helping indexes work better

Drew Montgomery wrote up an excellent description of how to use indexes more wisely:
There are times when it seems that the optimizer just doesn’t “get it”. In other words, it doesn’t use what seems like an obvious choice for an index. There are several possible reasons for that, and we are going to explore a couple of them here…

The first could be that the index in question is using old statistics. The statistics for an index are used when evaluating queries to help the optimizer determine the cost of using a particular index. Each index for a table is evaluated by the optimizer along with some other complex interactions to determine if an index (or set of indices if we are dealing with multiple tables) should be used. If the statistics for an index are out of date, old, tired, worn out, they may no longer accurately represent the distribution of the data. The optimizer, using the older statistics, may determine that another course of action is best. This is why we would want to periodically update the statistics. Of course, this isn’t a cure-all for query woes. Far from it! But it can be a cause of some optimizer confusion.
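
On ASE, refreshing statistics (and invalidating cached plans compiled against the old numbers) looks roughly like this, using the local_users table from the example below:

update index statistics local_users
sp_recompile local_users

update index statistics refreshes statistics on all columns of every index on the table; sp_recompile makes sure stored procedures re-optimize with the fresh numbers.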

Another reason could be that the query itself doesn’t let us use an index. You may be asking yourself “what could I have done to make the optimizer mad at me?” It’s not personal. It may just be the way you have rendered your query. One little-known fact is that if you apply a function to a column in the where clause, the optimizer can’t use an index on that column. In other words, if you have a query that looks like this:

select * from local_users
where upper(name) like "%SMITH%"

You have prevented the optimizer from using an index based on name. The act of upper-casing the column prevents the optimizer from using it. The same would be true if you had used an arithmetic function or even an expression:

select * from accounts
where account_limit * 10 < expected_account_limit

In the above case, the problem is that the optimizer could use expected_account_limit OR account_limit, but in no case could it use both. You would have to rework the query to move the *10 to the other side of the comparison to allow the optimizer to use the account_limit column.

In the same vein, if there is a data type mismatch between the columns being used in a query, the optimizer will have to make the column types "match" before it can complete the query. This means that the optimizer applies an implied function (convert) to align the data types. If you have the following query:

select p.name, p.address, j.title
from personnel p, jobs j
where p.j_id = j.j_id

and the column j_id in personnel allows nulls (for those people who are working under cover!), while j_id in jobs does not allow nulls since it is the primary key, the optimizer will not use an index based on the j_id column in the jobs table. In order to "match" the data types, it has to convert the j_id column in the jobs table from "int not null" to "int null" to match the definition of the j_id column from the personnel table. Sometimes the difference in the columns can be as subtle as a varchar(10) vs. a char(10), or a varchar(20) vs. a varchar(21). In each case, the optimizer will convert to the broadest common denominator to make the match, and in many cases that conversion will prevent the optimizer from using a given index.

The last possibility that I will discuss is the case where the wrong index exists, or a better index could be created. There are two distinct cases here. The first is where an index is given an advantage of some kind, such as being unique or clustered. That advantage could be enough to sway the optimizer toward a specific index. The other is that even though the query is using an index, another index may be more efficient.

I recently ran across this specific case. A query was using four different columns in its where clause, and two of those columns in an order by clause. The query was trying to retrieve data on the basis of the date of the record, in descending order (i.e. they wanted the "youngest" record first), and in fact there was a restriction in the select to get the "top 1" record based on these criteria. A facsimile of the query is as follows:

select top 1 name, balance_date, amount
from bank_info
where name = @name
  and account_type = @type
  and branch_id = @branch
  and balance_date < @eval_date
order by name, balance_date desc

The query was using an index that covered two of the four columns involved. Although this was better than a table scan (the table had 82 million rows), it was not hugely efficient. The index did reduce the data set being sifted through, but the query was still taking 30,000 I/Os to complete. Part of the problem was that once the basic query had been satisfied, the result was put into a sort table that was then interrogated for the youngest record.

This is a case for a new index. But the question is: what does the index go on? We could build an index against the id column and the date. That would help the sort, but would not help the where clause much. We could build an index on the four columns in the where clause, but the data found would still go into a sort table. Probably more efficient, but not as good as it gets.

For this particular query, three of the four columns are being matched exactly, with the fourth being the date column's limit on the cutoff date. If the cutoff date is midnight of March 11th, we are looking for the latest record from March 10th and earlier. The best solution is to create an index against the four columns in the where clause, with the date column last and (in the definition of the index) descending. That way, once we restrict on the other three columns, we start with the newest record. The other half of this is perhaps a little less obvious: use all of the columns of the new index in the order by clause. The change would be as follows:

select top 1 name, balance_date, amount
from bank_info
where name = @name
  and account_type = @type
  and branch_id = @branch
  and balance_date < @eval_date
order by name, account_type, branch_id, balance_date desc

That way the optimizer can use the index for the where clause and the order by clause simultaneously. The bottom line is that we went from thousands of I/Os, to hundreds of I/Os (with the old index and the old order by), and finally to a much more reasonable 10-15 I/Os. The difference in performance is dramatic.

Bottom line: be aware of not only what indices exist, but how they are being used. Make sure to take care of them, and they will be like a good motor oil - they will improve the performance of your system while extending the life of the application.
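
As a closing footnote, the supporting index the final query implies might be sketched like this (hypothetical index name; ASE allows a per-column sort order in the index definition):

create index bank_info_lookup_idx
    on bank_info (name, account_type, branch_id, balance_date desc)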
