p2p.wrox.com Forums

beginning_php thread: ch 7 sanitizepath.php


Message #1 by "Ron Mapes" <ron@m...> on Mon, 24 Feb 2003 23:49:37
Does anyone use the code from this script in any of their PHP files? How do
I incorporate it into the code? Would it be just another function placed in
the code referenced by a switch call? I have not come to grips with
securing a directory or files yet so any help or direction will be appreciated.

I would like to secure a directory from general access. Only users that
have a username & passwd will be allowed to access the links and
information contained there.

Thanks,
Ron
Message #2 by "Jonathan Lyons" <jlyons@l...> on Wed, 26 Feb 2003 22:52:34
> Does anyone use the code from this script in any of their PHP files? How do
> I incorporate it into the code? Would it be just another function placed in
> the code referenced by a switch call? I have not come to grips with
> securing a directory or files yet so any help or direction will be
> appreciated.
>
> I would like to secure a directory from general access. Only users that
> have a username & passwd will be allowed to access the links and
> information contained there.
>
> Thanks,
> Ron



It's somewhat comforting to see that someone else is struggling with this
stuff. "ereg" has me completely baffled. Maybe someone can explain to me
(I'm feeling a little stupid here) what the purpose of the variable "$trashed"
is in the email format example. Is it an array that holds all the $intext data?

I've reread the chapter a few times already and I'm trying to move on to
Chapter 8, but I'm afraid that I've missed something crucial.
Message #3 by "Nikolai Devereaux" <yomama@u...> on Wed, 26 Feb 2003 15:23:18 -0800
Hi Ron and Jonathan,

Could you post the code that's giving you trouble?  I don't have the book,
nor do I feel like downloading all the code for it. =)

Take care,

Nik

Message #4 by "Ron Mapes" <ron@m...> on Thu, 27 Feb 2003 00:50:12
Here is the code.

/sanitizepath.php

function SanitizePath($inpath) {
   $outpath = ereg_replace("\.[\.]+", "", $inpath);
   $outpath = ereg_replace("^[\/]+", "", $outpath);
   $outpath = ereg_replace("^[A-Za-z][:\|][\/]?", "", $outpath);
   return($outpath);
}

function SP($spinpath) { # A wrapper function used for display purposes.
   $spoutpath = SanitizePath($spinpath);
   print("Calling <b>SanitizePath()</b> on \"$spinpath\" yields
         \"$spoutpath\"<br>\n");
}

# Main
SP("/etc/passwd");
SP("myfilename.txt");
SP("mydir1/mydir2/mydir3/somefile.db");
SP("../../../../somefile.txt");
SP(".............../........../mypath/sillyfile.txt");
SP("\windows\win.ini");
SP("C:\windows\system.ini");
SP("c|\windows\control.ini");
SP("C:/some/weird/path/filename.txt");

Ron



Message #5 by "Nikolai Devereaux" <yomama@u...> on Wed, 26 Feb 2003 17:14:12 -0800
Okay, thanks.

Let me explain what SanitizePath() does, line by line.

1) function SanitizePath($inpath) {
2)    $outpath = ereg_replace("\.[\.]+", "", $inpath);
3)    $outpath = ereg_replace("^[\/]+", "", $outpath);
4)    $outpath = ereg_replace("^[A-Za-z][:\|][\/]?", "", $outpath);
5)    return($outpath);
6) }

Replacing a string with "" is essentially removing the original string.
That said, let's get to it:

1)  Declare a function to receive one parameter.
2)  Remove all strings of two or more consecutive dots.
3)  Remove all leading forward slashes.
4)  If the path contains a leading letter (upper or lower case),
    followed by a colon, backslash, or pipe, and optionally
    followed by a forward slash, remove that whole chunk.
5)  Return the modified string.

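I think some of those regular expressions are shaky where backslashes are
concerned, because two layers of escaping are in play.  PHP leaves "\/" in a
double-quoted string as the two characters \ and /, since \/ is not a string
escape, and "\\/" produces exactly the same two characters.  The regex
engine then has to interpret them.  Inside a POSIX bracket expression,
which is what the ereg functions use, a backslash is an ordinary character,
so [\/] happens to match either kind of slash.  In the preg functions \/
inside brackets means just the single character "/".  To ask for a literal
backslash unambiguously in either flavor, write "\\\\" in the PHP source.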

There's a lot more ereg and preg function help online, so I won't go into it
here, but if you're confused about regular expressions, the best place to
look really is online or in a book.  I learned ereg functions from
Professional PHP Programming (for PHP3, though).  I learned the preg way of
doing things from various sources, most notably the PHP preg manual and the
O'Reilly book "Mastering Regular Expressions" (though many REs in the book
are overkill for simple applications).
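
For comparison, here is a rough sketch of the same three steps using the preg functions, which is what you would need on current PHP (the ereg functions were deprecated and then removed in PHP 7). The function name sanitize_path_preg() is made up for this example, and the patterns are one reading of the book's intent rather than the book's own code. Single-quoted strings are used so that PHP's string escaping interferes as little as possible.

```php
<?php
// Sketch only: a preg_replace port of the three steps described above.
// sanitize_path_preg() is a hypothetical name, not something from the book.
function sanitize_path_preg(string $inpath): string
{
    // Step 2: remove every run of two or more consecutive dots.
    $outpath = preg_replace('#\.{2,}#', '', $inpath);
    // Step 3: remove leading slashes of either kind.
    // '\\\\' in PHP source reaches PCRE as \\, i.e. one literal backslash.
    $outpath = preg_replace('#^[\\\\/]+#', '', $outpath);
    // Step 4: remove a leading drive prefix such as "C:" or "c|",
    // plus one optional slash after it.
    $outpath = preg_replace('#^[A-Za-z][:|][\\\\/]?#', '', $outpath);
    return $outpath;
}

echo sanitize_path_preg('../../../../somefile.txt'), "\n";  // somefile.txt
echo sanitize_path_preg('C:\windows\system.ini'), "\n";     // windows\system.ini
```

Like the original, this only strips suspicious substrings; it does not check where the resulting path actually points.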



The other function, SP(), is what's called a "wrapper" function, which means
its job is primarily to call the other function.  The reason to use SP()
over calling SanitizePath() directly is that it also echoes the before and
after strings to the client, which means you don't have to type it every
time.  Perfect example of what a function should be.


The main part simply runs SP() on a bunch of strings to show you how they
are "sanitized".  I won't run them myself, but I can guess their output:


> # Main
> SP("/etc/passwd");

etc/passwd

> SP("myfilename.txt");

(unchanged)

> SP("mydir1/mydir2/mydir3/somefile.db");

(unchanged)

> SP("../../../../somefile.txt");

somefile.txt
   (all ..'s were removed first, leaving you with ////somefile.txt,
    then all leading slashes were removed.)


> SP(".............../........../mypath/sillyfile.txt");

mypath/sillyfile.txt
  (similarly, the two .... strings were removed, leaving you with
   //mypath/sillyfile.txt, then the leading slashes were removed.)


> SP("\windows\win.ini");

windows\win.ini

  (Read my note above -- this depends on the regular expressions
   stripping leading BACKslashes as well as forward slashes.  The
   string being passed is also fragile -- the backslashes should be
   escaped as "\\".  PHP happens to leave "\w" alone because it is
   not a recognized escape sequence, but a path containing "\n" or
   "\t" would be parsed as a newline or a tab character.)


> SP("C:\windows\system.ini");

windows\system.ini
  (same deal as above -- the backslashes SHOULD be escaped)

> SP("c|\windows\control.ini");

windows\control.ini
  (again, same deal.)

> SP("C:/some/weird/path/filename.txt");

some/weird/path/filename.txt
  (the leading C:/ is stripped by line 4 in the function.)
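
The point about unescaped backslashes in the test strings can be checked directly. This is a small sketch, not part of the book's script; the path C:\new\temp is invented purely to show a case where double quotes do change the string.

```php
<?php
// "\w" is not an escape sequence, so double quotes keep the backslash.
var_dump("\windows\win.ini" === '\windows\win.ini');   // bool(true)

// "\n" and "\t" are escape sequences, so this path is silently mangled.
var_dump(strlen("C:\new\temp"));   // int(9): contains a newline and a tab
var_dump(strlen('C:\new\temp'));   // int(11): single quotes keep it intact

// Escaping each backslash gives the intended string in double quotes too.
var_dump("C:\\new\\temp" === 'C:\new\temp');   // bool(true)
```

Single quotes or doubled backslashes both avoid the problem; relying on PHP to pass unknown escapes through works only until a directory name starts with n, r, t or another escape letter.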



Here's how I would rewrite the regular expressions to deal with backslashes
as well as forward slashes:

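function SanitizePath($inpath)
{
    $outpath = ereg_replace("\.[\.]+", "", $inpath);
    # "\\\\" in PHP source reaches the regex engine as \\, so each
    # bracket expression below matches a backslash or a forward slash.
    $outpath = ereg_replace("^[\\\\/]+", "", $outpath);
    $outpath = ereg_replace("^[A-Za-z][:\|][\\\\/]?", "", $outpath);
    return($outpath);
}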



On to your questions:


*  Does anyone use the code from this script in any of their PHP files?

Not me.

* How do I incorporate it into the code?

Generally speaking, utility functions are defined in include files.  These
are then made available to a PHP script using the include, include_once,
require, or require_once statements.  See
  http://www.php.net/include (et al) for more info.
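
As a rough illustration of the include-file approach: the file name sanitizepath.inc.php and the helper strip_leading_slashes() are both made up for this sketch. In a real project the include file would simply sit next to your scripts; it is generated in the temp directory here only so the example runs as a single file.

```php
<?php
// Sketch only: write a tiny include file, then load it with require_once.
// Both the file name and the function name are hypothetical.
$includeFile = sys_get_temp_dir() . '/sanitizepath.inc.php';
file_put_contents($includeFile, <<<'PHP'
<?php
// A stand-in utility function; a real file would hold SanitizePath() etc.
function strip_leading_slashes($inpath) {
    return ltrim($inpath, '/');
}
PHP
);

require_once $includeFile;   // loads the function definition
require_once $includeFile;   // second call is skipped, so no redeclare error

echo strip_leading_slashes('/etc/passwd'), "\n";   // etc/passwd
```

require_once is used rather than include so that a missing file stops the script immediately and the same definitions are never loaded twice.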

* Would it be just another function placed in the code
  referenced by a switch call?

I'm not sure what you mean... why would a switch() be necessary?  What are
you comparing?  A switch() stmt is used to execute a block of code, where
the block being executed is selected based on some condition.

While there certainly can be a scenario where a block of code in a switch()
stmt can call SanitizePath(), they are totally unrelated and neither implies
nor requires the use of the other.


ron> I have not come to grips with securing a directory or
ron> files yet so any help or direction will be appreciated.
ron>
ron> I would like to secure a directory from general access.
ron> Only users that have a username & passwd will be allowed
ron> to access to call the links and information contained there.


Okay -- this has *NOTHING* to do with SanitizePath().  SanitizePath() simply
removes potentially dangerous substrings from a file path, to prevent
a user from attempting to open a file or directory that's outside the
relative root of the executing script.

What you're talking about is establishing some sort of users and permissions
system on your website.

You're saying that you want a user to log in to the site before they're given
the ability to view files.  This is generally accomplished using sessions
and a database to store the valid usernames and passwords.
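
A minimal sketch of that idea follows, under some loud assumptions: the $users array stands in for a database table, the function names attempt_login() and is_logged_in() are invented for the example, and a plain array stands in for $_SESSION so the code also runs from the command line. It relies on password_hash() and password_verify(), which need PHP 5.5 or later.

```php
<?php
// Sketch only: the array below stands in for a users table in a database.
$users = [
    'ron' => password_hash('s3cret', PASSWORD_DEFAULT),
];

// Returns true and records the user in the session data on success.
function attempt_login(array $users, array &$session, string $name, string $password): bool
{
    if (isset($users[$name]) && password_verify($password, $users[$name])) {
        $session['user'] = $name;
        return true;
    }
    return false;
}

// Pages in the protected directory would call this before showing anything.
function is_logged_in(array $session): bool
{
    return isset($session['user']);
}

// In a web page you would call session_start() and pass $_SESSION instead.
$session = [];
var_dump(is_logged_in($session));                             // bool(false)
var_dump(attempt_login($users, $session, 'ron', 'wrong'));    // bool(false)
var_dump(attempt_login($users, $session, 'ron', 's3cret'));   // bool(true)
var_dump(is_logged_in($session));                             // bool(true)
```

Each protected page would check is_logged_in() first and send the visitor to a login form when it returns false. Files that the web server hands out directly, without going through PHP, need protecting separately.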


hope this clears a bunch of stuff up!

Nik

Message #6 by "Ron Mapes" <ron@m...> on Fri, 28 Feb 2003 01:33:17
Nik,
Thanks for the response. It will take a bit to digest. I have printed the
thread to reference as I tinker through my app. I was thinking about how to
secure the files and ended up with a few questions after reading the
section in the book.

Thanks,
Ron


