<?php
//Sitesearch v1.1 - Sheridan Saint-Michel with small modification by Jorge Luis Loza
//Begin Environment Variables. You need to set these!//
//List of directories to be included in the search.
//Usage "path" => "url" (path to directory and URL which corresponds to directory)
//remember to include the trailing / on the URL
$directories = array(
"/home2/MiniWeb/jorge/dir1" => "http://www.yourdomain.org/dir1/",
"/home2/MiniWeb/jorge/dir2" => "http://www.yourdomain.org/dir2/");
//End Environment Variables//
Function Keyword_Check($filenames,$keywords)
{
$retVal = array(); //matching files, returned empty when nothing matches
for($i = 0; $i < count($filenames); $i++)
{
$filename = $filenames[$i];
$match = 0;
$title = " ";
//Skip anything that is not a readable regular file (e.g. a subdirectory)
if (!is_file($filename)) continue;
$contents = file_get_contents($filename);
if ($contents === false) continue;
//Find title of File (case sensitive: only a lowercase <title>...</title> pair is recognised)
$pos = strpos ($contents,"<title>")+7;
$pos2 = strpos ($contents,"</title>")-$pos;
$title = substr("$contents",$pos,$pos2);
//Remove HTML Tags before searching
$search = array ("'<script[^>]*?>.*?</script>'si", // Strip out javascript
"'<[\/\!]*?[^<>]*?>'si", // Strip out html tags
"'([\r\n])[\s]+'", // Strip out white space
"'&(quot|#34);'i", // Replace html entities
"'&(amp|#38);'i",
"'&(lt|#60);'i",
"'&(gt|#62);'i",
"'&(nbsp|#160);'i",
"'&(iexcl|#161);'i",
"'&(cent|#162);'i",
"'&(pound|#163);'i",
"'&(copy|#169);'i");
$replace = array ("",
" ",
"\\1",
"\"",
"&",
"<",
">",
" ",
chr(161),
chr(162),
chr(163),
chr(169));
$contents = preg_replace ($search, $replace, $contents);
// The /e modifier was removed in PHP 7, so numeric entities (&#NNN;) are decoded with a callback
$contents = preg_replace_callback ("'&#(\d+);'", function ($m) { return chr((int) $m[1]); }, $contents);
$contents = preg_replace ("/\W/", " ", $contents);
$contents = preg_replace ("/\s+/", " ", $contents);
//Separate Each Word into an Array Element and Compare to Keywords
$contents = explode(" ", $contents);
$j = 0;
for($j = 0; $j < count($keywords); $j++)
{
for($k=0 ; $k < count($contents); $k++)
{
//compare contents with each keyword
if (!strcasecmp ($contents[$k], $keywords[$j]))
{
$match++;
break;
}
}
}
if ($match == count($keywords) )
$retVal[count($retVal)] = "$filename--*--$title";
}
return $retVal;
}
function Get_Filenames($directory)
{
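//Load Directory Into Array
$retVal = array();
$handle=opendir($directory);
//Compare against false so that a file named "0" does not end the loop early
while (($file = readdir($handle)) !== false)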
{
if ($file != "." && $file != ".." && $file !=".mhonarc.db" && $file !="index.htm" && $file !="threads.html" && substr($file,-3,3) !=".gz")
$retVal[count($retVal)] = $file;
}
//Clean up and sort
closedir($handle);
sort($retVal);
return $retVal;
}
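//register_globals no longer exists in current PHP, so read the form field explicitly
if ( isset($_GET['keyword']) ) $keyword = $_GET['keyword'];
if ( isset($keyword) )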
{
$keywords = explode(" ", $keyword);
$pages = array();
//each() was removed in PHP 8; foreach walks the same path => url pairs
foreach ($directories as $directory => $val)
{
chdir($directory) or die("Directory $directory Not found");
$filenames = Get_Filenames($directory);
$found = Keyword_Check($filenames,$keywords);
//add any pages with keywords in current directory to array
for($i = 0;$i < count($found); $i++)
{
$add = "$val$found[$i]";
$pages[count($pages)] = $add;
}
}
}
?>
<HTML>
<Head>
<Title>Website Search</Title>
</Head>
<Body>
<Body>
<Form Action=search.php>
<Input Type=Text Name=keyword Value="<?php echo isset($keyword) ? htmlspecialchars($keyword) : '' ?>">
<Input Type=Submit Value=Search>
<Input Type=Reset Value="New Search">
<?php
if ( isset($keyword) )
{
echo "<HR>\n";
echo "<Font Color=Blue>" . count($pages) . " pages matching your query were found</Font>";
echo "<HR>";
for ($i = 0; $i < count($pages); $i++)
{
//explode filename / title
$separamos=explode("--*--",$pages[$i]);
echo "<A HREF=\"$separamos[0]\">$separamos[1]</A><BR>";
}
}
?>
</Form>
</Body>
</HTML>
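The matching rule in Keyword_Check (a page is listed only when every keyword occurs in it as a whole word, ignoring case) can also be sketched with built-in functions. This is only an illustration, not part of SiteSearch: the name `contains_all_keywords` is made up here, and it assumes the page text is UTF-8.

```php
<?php
// Hypothetical helper (not part of SiteSearch): returns true only when every
// keyword appears as a whole word in the HTML, ignoring case.
function contains_all_keywords($html, array $keywords)
{
    // Remove <script> blocks first, because strip_tags() keeps their inner text.
    $text = preg_replace('~<script[^>]*>.*?</script>~si', ' ', $html);
    // Put a space in front of every tag so neighbouring elements do not fuse into one word.
    $text = str_replace('<', ' <', $text);
    $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');
    // Split on runs of non-word characters, much like the /\W/ step in the script.
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $lookup = array_flip($words);
    foreach ($keywords as $keyword) {
        if ($keyword === '') {
            continue; // ignore empty pieces produced by double spaces in the query
        }
        if (!isset($lookup[strtolower($keyword)])) {
            return false;
        }
    }
    return true;
}
```

Looking words up in a flipped array avoids the nested loop over every word for every keyword, which matters only when the searched pages are large.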
SiteSearch 1.1: This app lets end users search your site for keywords. You specify which directories should be included in the search.
eliteral chheda wrote : 645
Hi,
I found your script highly useful and awesome.
Could you tell me how I could query in such a way that I get 30 results per page?
Thanks.
Regards,
Nainil Chheda.
http://www.eliteral.com
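One possible way to get 30 results per page is to slice the `$pages` array before printing it. The sketch below is only a suggestion, not something the script already does: the helper names `results_for_page` and `page_count` are invented for illustration, and the page number is assumed to arrive in a hypothetical `page` query parameter.

```php
<?php
// Hypothetical helper: returns the results for a 1-based page number.
function results_for_page(array $pages, $page, $perPage = 30)
{
    $page = max(1, (int) $page); // treat missing or invalid page numbers as page 1
    return array_slice($pages, ($page - 1) * $perPage, $perPage);
}

// Hypothetical helper: total number of result pages (at least 1).
function page_count(array $pages, $perPage = 30)
{
    return max(1, (int) ceil(count($pages) / $perPage));
}
```

In the script, the output loop would then run over `results_for_page($pages, $_GET['page'])` instead of the whole `$pages` array, with "next" and "previous" links that repeat the keyword and change the page number.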
Sheridan Saint-Michel wrote : 671
Nice modification! I probably would have just done something like
preg_match("|<Title>(.+)</Title>|Ui", $contents, $regs );
$title = $regs[1];
Rather than mucking with all that strpos stuff, but like the Perl folks say, "There's More Than One Way to Do It" =)
Sheridan Saint-Michel wrote : 672
One thing, though: looking over the script, one problem is that the strpos/substr method, unlike the preg method, is case sensitive. Try changing one of the HTML pages it is searching so you have <TITLE>blah</TITLE> instead of <title>blah</title>. I just tried it and it broke.
If you use the preg I posted in the previous comment, though, everything works nicely. All in all you did a good job =P
Sheridan
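The case-sensitivity point is easy to check with a small wrapper around the preg_match approach from the comment above. This is a sketch only: `extract_title` and its 'Untitled' fallback are assumptions added for illustration, and the `s` flag and `[^>]*` were added so that titles spanning several lines or tags with attributes still match.

```php
<?php
// Hypothetical helper: returns the text between <title> tags, ignoring tag case.
function extract_title($contents, $default = 'Untitled')
{
    // U = ungreedy, i = case-insensitive, s = let "." match newlines
    if (preg_match('|<title[^>]*>(.+)</title>|Uis', $contents, $regs)) {
        return trim($regs[1]);
    }
    return $default; // no usable <title> element in the page
}
```

Unlike the strpos/substr version, this also returns a predictable value when a page has no title at all.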